Sound Snap – AI-Powered Audio-To-Midi Conversion Platform






Create Date 03/03/2026
Last Updated 03/03/2026


Description

“Sound Snap – AI-Powered Audio-To-Midi Conversion Platform”

Darshan Jibhau Thakare, Raj Bharat Nemade, Tanmay Prashant Mohan, Jayesh Suresh Patil, Ms. S. S. Pathare

1 darshanthakare90@gmail.com, Student of Diploma of Engineering, IT, K. K. Wagh Polytechnic, Nashik, India

2 rajnemade000@gmail.com, Student of Diploma of Engineering, IT, K. K. Wagh Polytechnic, Nashik, India

3 tanmaymohan70@gmail.com, Student of Diploma of Engineering, IT, K. K. Wagh Polytechnic, Nashik, India

4 8030jayeshpatil@gmail.com, Student of Diploma of Engineering, IT, K. K. Wagh Polytechnic, Nashik, India

5 sspathare@kkwagh.edu.in, Student of Diploma of Engineering, IT, K. K. Wagh Polytechnic, Nashik, India

Abstract - Currently, audio-to-MIDI conversion is often performed using complex Digital Audio Workstations (DAWs) or offline transcription tools that require manual processing, advanced technical knowledge, and significant computing resources. While several software solutions provide audio transcription capabilities, many lack real-time feedback, web accessibility, and seamless playback integration. Additionally, traditional workflows involve multiple steps such as exporting audio, running conversion software separately, manually importing MIDI files into players, and configuring instrument packs. These processes are time-consuming, error-prone, and not user-friendly for beginners or musicians seeking quick results.

The proposed system, Audio to MIDI Studio, offers a modern and automated web-based solution for converting audio recordings into MIDI files using Spotify’s Basic Pitch model. The application enables users to record audio directly in the browser or upload existing audio files. The backend processes the audio using an asynchronous job pipeline built with FastAPI, where the audio is preprocessed, normalized, and transcribed into MIDI format.
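The staged, asynchronous pipeline described above (preprocess, normalize, transcribe) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library rather than FastAPI, and the names `Job`, `peak_normalize`, `fake_transcribe`, and `run_job` are hypothetical. The transcription step is a placeholder standing in for the Basic Pitch model call, and peak normalization is assumed here as one plausible normalization method.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Job:
    """In-memory record for one transcription job (hypothetical structure)."""
    job_id: str
    status: str = "queued"
    history: List[str] = field(default_factory=list)
    result: Optional[bytes] = None

    def advance(self, status: str) -> None:
        # Record each stage so a client can be told where the job stands.
        self.status = status
        self.history.append(status)


def peak_normalize(samples: List[float], peak: float = 1.0) -> List[float]:
    """Scale samples so the largest absolute value equals `peak`."""
    largest = max((abs(s) for s in samples), default=0.0)
    if largest == 0.0:
        return list(samples)  # silence: nothing to scale
    return [s * peak / largest for s in samples]


async def fake_transcribe(samples: List[float]) -> bytes:
    """Placeholder for the model call; returns only the MIDI header chunk tag."""
    await asyncio.sleep(0)  # yield control, as a real awaitable call would
    return b"MThd"


async def run_job(job: Job, samples: List[float]) -> None:
    """Run the three stages in order, updating the job status at each step."""
    job.advance("preprocessing")
    mono = [float(s) for s in samples]  # assumed preprocessing: coerce to floats
    job.advance("normalizing")
    normalized = peak_normalize(mono)
    job.advance("transcribing")
    job.result = await fake_transcribe(normalized)
    job.advance("done")
```

In a web backend, `run_job` would typically be scheduled as a background task when a recording or upload arrives, so the HTTP request can return a job identifier immediately while the stages run.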

The system provides real-time job status updates using WebSocket communication, ensuring a responsive user experience. Once transcription is complete, the generated MIDI file can be played directly in the browser using html-midi-player, with selectable SoundFont instrument packs. Users can also download the MIDI file for further editing or production use.
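The real-time status updates can be pictured as a small publish/subscribe hub: the job pipeline publishes a status message, and each connected client receives it. The sketch below is an assumption-laden illustration using only the standard library; `StatusHub`, the JSON message fields (`job_id`, `status`, `progress`), and the `demo` coroutine are hypothetical and not taken from the paper. In the described system, a WebSocket handler would forward each queued message to the browser.

```python
import asyncio
import json
from typing import Dict, List


class StatusHub:
    """Fan-out of job status messages to subscribers (hypothetical helper)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[asyncio.Queue]] = {}

    def subscribe(self, job_id: str) -> asyncio.Queue:
        # Each subscriber gets its own queue; a WebSocket handler would
        # read from it and send every message to its client.
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.setdefault(job_id, []).append(queue)
        return queue

    def publish(self, job_id: str, status: str, progress: float) -> None:
        # Serialize once, then deliver the same message to every subscriber.
        message = json.dumps(
            {"job_id": job_id, "status": status, "progress": progress}
        )
        for queue in self._subscribers.get(job_id, []):
            queue.put_nowait(message)


async def demo() -> List[str]:
    """Publish two updates for one job and collect what a subscriber sees."""
    hub = StatusHub()
    inbox = hub.subscribe("job-1")
    hub.publish("job-1", "transcribing", 0.5)
    hub.publish("job-1", "done", 1.0)
    return [await inbox.get(), await inbox.get()]
```

Keeping one queue per subscriber means a slow client delays only its own messages, not those of other clients watching the same job.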

This automated workflow reduces manual effort, eliminates dependency on complex desktop tools, and provides a fast, secure, and efficient method for audio-to-MIDI transcription. By integrating modern web technologies with machine learning-based transcription, Audio to MIDI Studio delivers a production-ready, scalable, and user-friendly solution for musicians, educators, and developers.

Key Words: Audio to MIDI conversion, Basic Pitch, FastAPI, WebSocket, MIDI playback, real-time transcription, web-based music processing.
