AI Voice Cloning using Deep Learning





Create Date 25/06/2025
Last Updated 25/06/2025

Akshay Kumar¹, Dr. Amandeep², Ritu³

¹,³ M.Sc. Computer Science, Artificial Intelligence and Data Science, GJUS&T Hisar

² Assistant Professor, Artificial Intelligence and Data Science, GJUS&T Hisar
Email: akshay9068s@gmail.com

Abstract: In this project, we built a voice cloning system using deep learning. The goal was a model that takes speech from one person and converts it into another person’s voice so that the result sounds real and natural. We trained the model on the LibriSpeech dataset because it contains a large number of recordings from many different speakers, which helped the model learn how various people speak. To process the audio, we first converted each recording into features such as mel spectrograms and pitch (F0), which capture the sound and style of a speaker’s voice. These features were then used to train a neural network that learns the target speaker’s voice style and applies it to a new voice. We used a multi-speaker training method so that the system is not limited to one or two speakers and can handle many different voices.
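The two features named above can be illustrated with a minimal sketch. It is not the feature extractor used in this project, whose details are not given here. It assumes the common 2595·log10 form of the mel scale and a simple autocorrelation pitch estimate on a single frame. The function names (`hz_to_mel`, `mel_band_centers`, `estimate_f0`) and the 80 to 400 Hz search range are hypothetical choices made for illustration. In practice a library routine such as `librosa.feature.melspectrogram` or `librosa.pyin` would usually be preferred.

```python
import math


def hz_to_mel(freq_hz):
    """Map a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands, f_low, f_high):
    """Return n_bands centre frequencies (Hz) spaced evenly on the mel scale."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_bands + 1)
    return [mel_to_hz(m_low + step * (k + 1)) for k in range(n_bands)]


def estimate_f0(frame, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate the pitch of one frame from its strongest autocorrelation lag.

    Assumption: the frame is voiced. No voiced/unvoiced decision is made,
    so silence or noise still yields some value in [f_min, f_max].
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    # Use the same number of products for every lag so scores are comparable.
    window = len(frame) - lag_max
    if window <= 0:
        raise ValueError("frame is too short for the requested f_min")
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


SAMPLE_RATE = 16000  # LibriSpeech audio is sampled at 16 kHz

# Synthetic 150 Hz tone standing in for one voiced speech frame.
tone = [math.sin(2 * math.pi * 150.0 * n / SAMPLE_RATE) for n in range(1000)]
print("estimated F0 (Hz):", round(estimate_f0(tone, SAMPLE_RATE), 1))
print("first mel band centres (Hz):",
      [round(f) for f in mel_band_centers(8, 0.0, 8000.0)])
```

The band centres crowd together at low frequencies and spread out at high ones, which is why a mel spectrogram gives more resolution to the range where speaker identity cues are concentrated. The pitch estimate is quantised to integer lags, so it is only approximate.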

After training, we tested the model by giving it new voice samples and asking it to clone those voices into different speaker styles. The results were good: the converted voices sounded very close to the target speakers and were easy to understand.

We also inspected the waveforms and ran listening tests to compare the original and cloned voices. The output was smooth and clear, indicating that the model learned speaker characteristics effectively.

Overall, this project shows that voice cloning using deep learning is feasible and can give good results even without a huge amount of data. It has many potential uses, such as helping people who cannot speak, making virtual assistants more personal, and dubbing videos in different voices. In the future, we can try adding emotions or working on real-time voice conversion.

Keywords: Voice Cloning, Deep Learning, Mel Spectrogram, Speaker Conversion, Speech Synthesis, LibriSpeech.
