Acoustic Pedestrian Detection Tutorial¶

!pip install pysensing

In this tutorial, we implement acoustic pedestrian detection with pysensing.

import torch  # imported explicitly rather than relying on the star import below
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
import numpy as np
import pysensing.acoustic.preprocessing.transform as transform
from pysensing.acoustic.inference.utils import *
from pysensing.acoustic.datasets.ped_det import AVPed,AFPILD
from pysensing.acoustic.models.ped_det import PED_CNN,PED_CRNN
from pysensing.acoustic.models.get_model import load_ped_det_model
# Use the first GPU when available, otherwise fall back to the CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness¶

Reimplementation of “AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness”.

This dataset contains pedestrian footstep sounds, which are used for pedestrian localization.

Note: Unlike the original paper, which trains the network on both audio and visual data, this library uses only audio data for pedestrian localization.

The dataset can be downloaded from https://github.com/yizhuoyang/AV-PedAware

Load the data¶

root = './data' # Path to the directory that contains the AVPed dataset
avped_traindataset = AVPed(root,'train')
avped_testdataset = AVPed(root,'test')
# Pick a sample index to visualize
index = 20
spectrogram,position,lidar = avped_traindataset[index]
plt.figure(figsize=(5,3))
# Show the first slice along dimension 0 of the spectrogram tensor
plt.imshow(spectrogram.numpy()[0])
plt.title("Spectrogram")
plt.show()
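
For intuition about what the plotted array represents, here is a minimal, dependency-free sketch of a magnitude spectrogram: the waveform is cut into overlapping frames, each frame is windowed, and a discrete Fourier transform gives one column of magnitudes. The function name `magnitude_spectrogram`, the frame and hop sizes, and the synthetic test tone are illustrative assumptions. This is not how pysensing computes the AVPed features.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes (bins 0..frame_len//2)."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    # Precompute the DFT factors once; they are the same for every frame.
    twiddle = [
        [cmath.exp(-2j * math.pi * k * n / frame_len) for n in range(frame_len)]
        for k in range(n_bins)
    ]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([abs(sum(x * w for x, w in zip(frame, row))) for row in twiddle])
    return frames


# Hypothetical input: a pure tone that completes 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = magnitude_spectrogram(tone)
# For each frame, find the frequency bin with the largest magnitude.
peak_bins = [max(range(len(col)), key=col.__getitem__) for col in spec]
print(len(spec), len(spec[0]), peak_bins)
```

Each inner list corresponds to one time step and each position within it to one frequency bin, which is the same time-frequency layout an image plot of a spectrogram displays.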

Load model¶

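# Method 1: instantiate the model class directly
avped_model = PED_CNN(0.2).to(device)
# Method 2: load a pretrained model by name (this replaces the Method 1 instance)
avped_model = load_ped_det_model('ped_cnn',pretrained=True).to(device)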

Model Training and Testing¶

# Model training
from pysensing.acoustic.inference.training.ped_det_train import *
avped_trainloader = DataLoader(avped_traindataset,batch_size=64,shuffle=True,drop_last=True)
# Build the test loader from the test split, not the training split
avped_testloader  = DataLoader(avped_testdataset,batch_size=64,shuffle=True,drop_last=True)
epoch = 1
optimizer = torch.optim.Adam(avped_model.parameters(), 0.001)
loss = ped_det_train_val(avped_model,avped_trainloader,avped_testloader, epoch, optimizer, device, save_dir='/data',save = False)

# Model testing
loss = ped_det_test(avped_model,avped_testloader,  device)

Train round0/1: 100%|██████████| 152/152 [00:25<00:00,  6.60batch/s]

Epoch:1, Train loss:nan

Test round: 100%|██████████| 152/152 [00:20<00:00,  7.65batch/s]

Test loss:nan

Test round: 100%|██████████| 152/152 [00:18<00:00, 12.12batch/s]

Test loss:nan
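
Both the training and the test loss above are reported as nan. An arithmetic mean becomes nan as soon as a single term is nan, so one way to narrow this down is to record which batches produce a non-finite loss. The sketch below is a hedged, library-independent illustration: `summarize_losses` is a hypothetical helper, and the per-batch losses are made-up values rather than output from `ped_det_train_val`.

```python
import math


def summarize_losses(batch_losses):
    """Split per-batch losses into a finite mean and the indices of non-finite ones."""
    finite = [loss for loss in batch_losses if math.isfinite(loss)]
    bad_indices = [i for i, loss in enumerate(batch_losses) if not math.isfinite(loss)]
    # Average only the finite values; report None if every batch was non-finite.
    mean_finite = sum(finite) / len(finite) if finite else None
    return mean_finite, bad_indices


# Made-up per-batch losses: batch 2 is nan, which poisons a plain average.
losses = [0.5, 0.25, float("nan"), 0.75]
plain_mean = sum(losses) / len(losses)
mean_finite, bad = summarize_losses(losses)
print(math.isnan(plain_mean), mean_finite, bad)
```

In a PyTorch training loop, the same check could be applied to `loss.item()` for each batch, and the offending batch inspected for nan values in its inputs or labels.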

Model Inference¶

# Method 1: call the model directly
spectrogram,position,lidar = avped_testdataset[1]
avped_model.eval()
# Add a batch dimension and run a forward pass
predicted_result = avped_model(spectrogram.unsqueeze(0).float().to(device))
position = position.unsqueeze(0).numpy()
predicted_result = predicted_result.cpu().detach().numpy()
draw_scenes(lidar,position,predicted_result)

# Method 2: use the ped_det_predict helper from inference.predict
from pysensing.acoustic.inference.predict import *
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
predicted_result  = ped_det_predict(spectrogram,'AVPed',avped_model, device=device)
predicted_result = predicted_result.cpu().detach().numpy()
draw_scenes(lidar,position,predicted_result)

[Open3D WARNING] invalid color in PaintUniformColor, clipping to [0, 1]
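Besides the visual check with `draw_scenes`, the prediction can be compared to the ground truth numerically. The sketch below computes a Euclidean localization error with the standard library. It assumes that both the prediction and the label are flat sequences of coordinates in the same units, which the tutorial does not state; `localization_error` and the sample values are hypothetical.

```python
import math

def localization_error(predicted, target):
    """Euclidean distance between two coordinate sequences of equal length."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return math.dist(predicted, target)

# Hypothetical 2-D positions, not values from the dataset
predicted_xy = [1.0, 2.0]
target_xy = [4.0, 6.0]
error = localization_error(predicted_xy, target_xy)
print(f"Localization error: {error:.2f}")
```

With the arrays from the cell above, and assuming both have shape `(1, D)`, the call would look like `localization_error(predicted_result[0].tolist(), position[0].tolist())`.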

Model Embedding¶

from pysensing.acoustic.inference.embedding import *
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
sample_embedding = ped_det_embedding(spectrogram,'AVPed',avped_model, device=device)
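One possible use of such embeddings is to compare two samples by cosine similarity. The sketch below is written in plain Python under the assumption that each embedding can be flattened to a one-dimensional list of floats; the shape and type returned by `ped_det_embedding` are not documented here, and `cosine_similarity` is a hypothetical helper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, not outputs of the model
same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same, orthogonal)
```

If the embedding is a tensor, something like `sample_embedding.flatten().tolist()` would produce a suitable input, assuming it lives on the CPU.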

AFPILD: Acoustic footstep dataset collected using one microphone array and LiDAR sensor for person identification and localization¶

Reimplementation of “AFPILD: Acoustic footstep dataset collected using one microphone array and LiDAR sensor for person identification and localization”.

# This dataset contains pedestrians' footstep sounds, which are used for pedestrian localization and classification

Load the data¶

# Method 1: Use get_dataloader
from pysensing.acoustic.datasets.get_dataloader import *
train_loader,test_loader = load_ped_det_dataset(
    root='./data',
    dataset='AFPILD',
    download=True)

# Method 2: build the datasets and dataloaders manually
root = './data' # Directory that contains the AFPILD dataset
afpild_traindataset = AFPILD(root,'ideloc_ori_cloth','train')
afpild_testdataset = AFPILD(root,'ideloc_ori_cloth','test')
# Define the dataloaders
afpild_trainloader = DataLoader(afpild_traindataset,batch_size=64,shuffle=True,drop_last=True)
afpild_testloader = DataLoader(afpild_testdataset,batch_size=64,shuffle=True,drop_last=True)
# Pick one test sample to inspect (any valid index works)
index = 330
data_dict,label = afpild_testdataset[index]

fig, axs = plt.subplots(1, 2, figsize=(7, 4))

axs[0].imshow(data_dict['spec'][:,:,0], aspect='auto', origin='lower')
axs[0].set_title('Spectrogram')
axs[0].set_xlabel('Time')
axs[0].set_ylabel('Frequency')

axs[1].imshow(data_dict['gcc'][:,:,0], aspect='auto', origin='lower')
axs[1].set_title('GCC')
axs[1].set_xlabel('Time')
axs[1].set_ylabel('Lag')

person_id, angle = label
print(f"Person ID: {person_id}, Angle: {angle:.2f} radians")

using dataset: AFPILD
Person ID: 0.0, Angle: 0.52 radians
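The `gcc` feature plotted above has a lag axis because generalized cross-correlation measures how far one microphone channel is shifted relative to another. How AFPILD computes this feature is not shown in this tutorial; the sketch below assumes the common GCC-PHAT formulation and uses NumPy only. The function name `gcc_phat`, the synthetic signals, and all parameter values are illustrative assumptions.

```python
import numpy as np

def gcc_phat(sig, ref, max_lag):
    """Return (lags, correlation) for lags in [-max_lag, max_lag] using PHAT weighting."""
    n = len(sig) + len(ref)                      # zero-pad to avoid circular wrap-around
    sig_fft = np.fft.rfft(sig, n=n)
    ref_fft = np.fft.rfft(ref, n=n)
    cross = sig_fft * np.conj(ref_fft)           # cross-power spectrum
    cross = cross / (np.abs(cross) + 1e-12)      # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that negative lags come first, then lag 0, then positive lags
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc

# Synthetic check: `sig` is `ref` delayed by a known number of samples
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
true_delay = 5
sig = np.concatenate((np.zeros(true_delay), ref))
lags, cc = gcc_phat(sig, ref, max_lag=20)
estimated_delay = int(lags[np.argmax(cc)])
print(estimated_delay)
```

In this convention a positive lag means that `sig` arrives later than `ref`. Stacking such correlation vectors over successive time frames would give a time-by-lag image similar in layout to the plot above.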

Model training¶

from pysensing.acoustic.inference.training.AFPILD_utils.training import afpild_train

afpild_train(
    config_file="./data/AFPILD/afpild_spec_gcc_fusion.json",
    root_dir='./data/AFPILD',
    task='accil_ana_shoe',
    epochs=1,
    num_workers=4,
    dataset_dir='./data/AFPILD/')

warning: logging configuration file is not found in data/AFPILD/logger/logger_config.json.
DEBUG:train:+---------------------+
|TRAIN: accil_ana_shoe|
+---------------------+
INFO:model:CRNN(
  (conv_block_list): ModuleList(
    (0): ConvBlock(
      (conv): Conv2d(10, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (2): Dropout2d(p=0.05, inplace=False)
    (3): ConvBlock(
      (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (4): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (5): Dropout2d(p=0.05, inplace=False)
    (6): ConvBlock(
      (conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (7): MaxPool2d(kernel_size=(2, 1), stride=(2, 1), padding=0, dilation=1, ceil_mode=False)
    (8): Dropout2d(p=0.05, inplace=False)
  )
  (gru): GRU(1024, 128, num_layers=2, batch_first=True, dropout=0.05, bidirectional=True)
  (fnn_list): ModuleList(
    (0): Linear(in_features=128, out_features=128, bias=True)
    (1): Linear(in_features=128, out_features=80, bias=True)
  )
)
Max LR: 0.001
DEBUG:trainer:Train Epoch: 1 [0/171 (0%)] loss: 0.017025
DEBUG:trainer:Train Epoch: 1 [11/171 (6%)] loss: 0.014435
DEBUG:trainer:Train Epoch: 1 [22/171 (13%)] loss: 0.012931
DEBUG:trainer:Train Epoch: 1 [33/171 (19%)] loss: 0.012686
DEBUG:trainer:Train Epoch: 1 [44/171 (26%)] loss: 0.012567
DEBUG:trainer:Train Epoch: 1 [55/171 (32%)] loss: 0.012573
DEBUG:trainer:Train Epoch: 1 [66/171 (39%)] loss: 0.012599
DEBUG:trainer:Train Epoch: 1 [77/171 (45%)] loss: 0.012555
DEBUG:trainer:Train Epoch: 1 [88/171 (51%)] loss: 0.012523
DEBUG:trainer:Train Epoch: 1 [99/171 (58%)] loss: 0.012532
DEBUG:trainer:Train Epoch: 1 [110/171 (64%)] loss: 0.012507
DEBUG:trainer:Train Epoch: 1 [121/171 (71%)] loss: 0.012506
DEBUG:trainer:Train Epoch: 1 [132/171 (77%)] loss: 0.012444
DEBUG:trainer:Train Epoch: 1 [143/171 (84%)] loss: 0.012454
DEBUG:trainer:Train Epoch: 1 [154/171 (90%)] loss: 0.012457
DEBUG:trainer:Train Epoch: 1 [165/171 (96%)] loss: 0.012463
INFO:trainer:epochs: 1, iterations: 171, Runtime: 00 hours, 00 minutes, 15 seconds
                       mean       std
loss               0.012816  0.001311
mae                 57.3527       NaN
accuracy           0.023862       NaN
mae_d             56.110302       NaN
accuracy_l         0.006354       NaN
accuracy_l30       0.008137       NaN
val_loss           0.012418     0.001
val_mae           53.852333       NaN
val_accuracy       0.027413       NaN
val_mae_d         53.383728       NaN
val_accuracy_l     0.007787       NaN
val_accuracy_l30     0.0112       NaN
best val acc updated with 0.0077866666666666666
best val acc (threshold 30) updated with 0.0112
best val loc updated with 53.38372802734375
INFO:trainer:Best val_loss: 0.01242
INFO:trainer:Saving model: data/AFPILD/saved/AFPILD-CRNN/20250205075140/model/model_best.pth ...
INFO:train:+------+
|result|
+------+
loss                 0.012816
mae                   57.3527
accuracy             0.023862
mae_d               56.110302
accuracy_l           0.006354
accuracy_l30         0.008137
val_loss             0.012418
val_mae             53.852333
val_accuracy         0.027413
val_mae_d           53.383728
val_accuracy_l       0.007787
val_accuracy_l30       0.0112
Name: mean, dtype: object

Model testing¶

from pysensing.acoustic.inference.training.AFPILD_utils.testing import afpild_testing
afpild_testing(
    config_file="./data/AFPILD/afpild_spec_gcc_fusion.json",
    root_dir="./data/AFPILD",
    dataset_dir="./data/AFPILD/",
    # Path to a trained checkpoint; the timestamped folder name is specific to each training run
    resume_path="./data/AFPILD/saved/AFPILD-CRNN/20241030055348/model/model_best.pth",
    task='accil_ori_rd',
)

warning: logging configuration file is not found in data/AFPILD/logger/logger_config.json.
DEBUG:test:+------------------+
|TEST: accil_ori_rd|
+------------------+
INFO:test:Loading model from ./data/AFPILD/saved/AFPILD-CRNN/20241030055348/model/model_best.pth ...
Testing...

100%|██████████| 243/243 [00:04<00:00, 53.97it/s]
(31087, 80)
INFO:test:                  mean    std
loss          0.012427  0.001
mae           52.57902    NaN
accuracy      0.028855    NaN
mae_d           55.203    NaN
accuracy_l    0.007945    NaN
accuracy_l30  0.011034    NaN
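The `resume_path` passed to `afpild_testing` contains a timestamped run folder, and the training log above shows a different timestamp for the run it saved. A small helper can pick the newest checkpoint instead of hard-coding the folder name. This is a sketch under the assumption that checkpoints follow the `<saved_root>/<timestamp>/model/model_best.pth` layout seen in the logs and that folder names sort chronologically; `latest_checkpoint` is a hypothetical helper, not a pysensing function.

```python
import tempfile
from pathlib import Path

def latest_checkpoint(saved_root, filename="model_best.pth"):
    """Return the checkpoint in the newest timestamped run folder, or None if none exist."""
    candidates = sorted(
        Path(saved_root).glob(f"*/model/{filename}"),
        key=lambda p: p.parent.parent.name,  # YYYYMMDDHHMMSS names sort chronologically
    )
    return candidates[-1] if candidates else None

# Demo on a throwaway directory tree that mimics the layout in the logs
with tempfile.TemporaryDirectory() as tmp:
    for run in ("20241030055348", "20250205075140"):
        model_dir = Path(tmp) / run / "model"
        model_dir.mkdir(parents=True)
        (model_dir / "model_best.pth").touch()
    newest = latest_checkpoint(tmp)
    newest_run = newest.parent.parent.name
    print(newest_run)
```

In the tutorial setting the call would be `latest_checkpoint("./data/AFPILD/saved/AFPILD-CRNN")`, with the result converted to a string before it is passed as `resume_path`. Note that a checkpoint trained for one task is not necessarily appropriate for testing another task.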

Model inference¶

# Load the model, option 1: build the architecture, then load weights manually
avped_model = PED_CRNN(task='ideloc_ori_cloth.pth').to(device)
# avped_model.load_state_dict(torch.load('path to weights',weights_only=True)['models']['model'])

# Load the model, option 2: fetch pretrained weights with load_ped_det_model
avped_model = load_ped_det_model('ped_crnn',pretrained=True,task='ideloc_ori_cloth').to(device)

# Model prediction 1: call the model directly on a batched dict of tensors
data_dict_tensor = {k: torch.Tensor(v).to(device).unsqueeze(0).float() for k, v in data_dict.items()}
output = avped_model(data_dict_tensor).squeeze(0).detach().cpu().numpy()

# Model prediction 2: use the ped_det_predict helper
from pysensing.acoustic.inference.predict import *
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
predicted_result = ped_det_predict(data_dict,'AFPILD',avped_model, device=device)
predicted_result = predicted_result.cpu().detach().numpy()
# Note: the values printed below are read from `output` (prediction 1);
# output[:40] is used for the person ID and output[-1] for the angle
print("The predicted person id is: {}, the ground truth is: {}".format(np.argmax(output[:40]),int(label[0])))
print("The predicted angle is: {}, the ground truth is: {}".format(output[-1],label[1]))

The predicted person id is: 0, the ground truth is: 0
The predicted angle is: 0.503690242767334, the ground truth is: 0.5166419512569265
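The two printed values can be wrapped into a small decoding step plus an angular error. The sketch below assumes the layout implied by the indexing above: person-ID scores in the first 40 entries and the angle in radians in the last entry. It also assumes that angles should be compared modulo a full turn, which the tutorial does not state. `decode_output`, `angular_error`, and the synthetic vector are hypothetical.

```python
import math
import numpy as np

NUM_IDS = 40  # assumed number of person-ID scores, taken from output[:40] above

def decode_output(output, num_ids=NUM_IDS):
    """Split a flat output vector into (predicted person ID, predicted angle in radians)."""
    output = np.asarray(output, dtype=float)
    person_id = int(np.argmax(output[:num_ids]))
    angle = float(output[-1])
    return person_id, angle

def angular_error(predicted, target):
    """Smallest absolute difference between two angles, in radians."""
    diff = (predicted - target + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)

# Synthetic vector: class 0 has the highest score, angle copied from the printout above
scores = np.zeros(NUM_IDS)
scores[0] = 3.0
fake_output = np.concatenate((scores, [0.503690242767334]))
person_id, angle = decode_output(fake_output)
error_deg = math.degrees(angular_error(angle, 0.5166419512569265))
print(person_id, f"{error_deg:.2f} degrees")
```

The wrap-around in `angular_error` matters only near the ends of the angle range, where two nearly identical directions can have very different raw values.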

Model embedding¶

from pysensing.acoustic.inference.embedding import *
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
sample_embedding = ped_det_embedding(data_dict,'AFPILD',avped_model, device=device)

And that’s it. We’re done with our acoustic pedestrian detection tutorial. Thanks for reading.

Total running time of the script: (1 minutes 43.586 seconds)
