S2I Translation

Implementation of a sound-to-image (S2I) translation system using PyTorch

Figure: S2I translator training scheme

Sound-to-Imagination: Unsupervised Crossmodal Translation Using Deep Dense Network Architecture
Leonardo A. Fanzeres, Climent Nadeu
International Journal of Computer Vision (under review), 2021 | arXiv

The motivation of our research is to develop a sound-to-image (S2I) translation system that enables a human receiver to visually infer the occurrence of sound-related events. We expect the computer to ‘imagine’ the scene from the captured sound, generating original images that picture the sound-emitting source.

Setup

Requirements (tested versions)

csv (1.0)
matplotlib (2.2.2 to 3.1.1)
numpy (1.14.2 to 1.17.2)
python (3.5.2 to 3.7.4)
scipy (1.0.1 to 1.3.1)
torch (1.1.0)
torchvision (0.3.0)
The code can be executed in CPU mode, but running on a GPU with CUDA (9.0.176) + cuDNN is recommended.
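
Optional environment check: the snippet below, assuming the packages above are installed, prints the detected versions so they can be compared against the tested ranges.

```python
# Optional sanity check of the tested dependency versions listed above.
import sys
import matplotlib
import numpy
import scipy
import torch
import torchvision

print("python:", sys.version)                   # tested: 3.5.2 to 3.7.4
print("numpy:", numpy.__version__)              # tested: 1.14.2 to 1.17.2
print("scipy:", scipy.__version__)              # tested: 1.0.1 to 1.3.1
print("matplotlib:", matplotlib.__version__)    # tested: 2.2.2 to 3.1.1
print("torch:", torch.__version__)              # tested: 1.1.0
print("torchvision:", torchvision.__version__)  # tested: 0.3.0
print("CUDA available:", torch.cuda.is_available())  # GPU with CUDA 9.0 recommended
```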

Get Started

  1. Install PyTorch and the other required packages listed above
  2. Clone or download this repository
  3. Download data binary files from https://github.com/leofanzeres/s2i_data.git
  4. Execute a quantitative test using the interpretability classifiers
  5. Execute a qualitative test generating the translated images
  6. Train the autoencoder model from scratch by executing actions/train_net_audio_autoencoder.py (an illustrative training-loop sketch follows this list)
  7. Train the visual generator model from scratch and report the achieved interpretability by executing ... (to be made available)
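
For orientation only, the sketch below shows what a minimal convolutional autoencoder training loop over spectrogram-like inputs looks like in PyTorch. The class name, layer sizes, and dummy data are illustrative assumptions and do not reproduce the architecture implemented in actions/train_net_audio_autoencoder.py.

```python
# Illustrative sketch: a minimal convolutional autoencoder trained to reconstruct
# spectrogram-like inputs. Names, layer sizes, and the random dummy data are
# hypothetical; the repository's actual model and data pipeline differ.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class ToyAudioAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: compress a 1x128x128 input down to a 64x16x16 feature map
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: mirror the encoder to reconstruct the input
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ToyAudioAutoencoder().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# Random tensors standing in for real spectrogram data
dummy = torch.rand(64, 1, 128, 128)
loader = DataLoader(TensorDataset(dummy), batch_size=16, shuffle=True)

for epoch in range(3):
    for (batch,) in loader:
        batch = batch.to(device)
        optimizer.zero_grad()
        loss = criterion(model(batch), batch)
        loss.backward()
        optimizer.step()
    print("epoch", epoch, "reconstruction loss", loss.item())
```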

Acknowledgments

This work was supported in part by the Brazilian National Council for Scientific and Technological Development (CNPq) under PhD grant 200884/2015-8, and in part by the Spanish State Research Agency (AEI) project PID2019-107579RB-I00/AEI/10.13039/501100011033. The authors thank Santiago Pascual for his advice on the implementation of GANs, and Josep Pujal for his support in using the computational resources of the Signal Theory and Communications Department at the Polytechnic University of Catalonia (UPC).