¿Cómo se editan las pausas en los videos de YouTube?

9

He notado en algunos videos de YouTube que se han cortado las pausas en el habla. No hay pausas, y esto hace que el video sea más fácil de ver.

¿Cómo se hace esto? He buscado esto, pero solo he encontrado tutoriales para hacerlo manualmente, ¡pero espero que haya una forma automática para los videos de YouTube, espero!

Aquí hay un ejemplo de uno de estos videos:

youtube

— AJ Henderson
fuente

2

He estado haciendo videos de YouTube desde 2006, y eliminé las pausas en mis videos. ¡Nunca he oído hablar de una forma automática de hacerlo! Sin embargo, su software de edición de video (como iMovie o Premiere) puede mostrar la pista de audio como una onda de sonido visible en la pantalla. Busco los puntos cuando está "en silencio", y los corto. Sí, es un proceso manual, pero es más fácil y rápido mirar la onda de sonido en la pista de audio que escuchar el inicio y el final de todo lo que dije en el video.

— BrettFromLA

Tampoco sé de una forma automática de hacer esto. Tenga en cuenta que cualquier método automático, si existe, no podrá distinguir una pausa a mitad de una pausa entre, digamos, dos oraciones. Por lo tanto, tendrá que escanear el resultado manualmente, lo que frustra el propósito.

— Gyan

@BrettFromLA Hay una manera, lo leí antes, simplemente no puedo recordar el programa.

Tengo esta misma pregunta, y está relacionada con estos otros 2: superuser.com/questions/990357/…

— Ryan

5

El efecto del que estás hablando se llama jumpcut y es particularmente popular entre los vloggers.
Hasta donde yo sé, no hay una forma automática de hacerlo, aunque probablemente sería posible hacerlo desde una perspectiva tecnológica. Sin embargo, usar jumpcuts tiene varias funciones.

Te deshaces del silencio y exprimes la mayor cantidad de información posible, manteniendo la atención de los espectadores.
La segunda función importante es la selección de tus disparos. Mientras revisa su material (toma generalmente en una toma) puede descartar cosas que no desea (incluso si estaba hablando en ese momento) y alterar fundamentalmente el tono de su video.

Vale la pena este mayor control creativo teniendo en cuenta que editar su video manualmente de esta manera es muy fácil y no tomará mucho tiempo o esfuerzo.

tl; dr
Puede ser factible, pero ahora no hay una solución fácil de usar. De todos modos, sugeriría hacerlo manualmente por varias razones descritas en la respuesta completa.

— Delltar
fuente

4

Estas perdiendo el punto. No está cortando el silencio, está manteniendo un buen ritmo agitado, lo que hace que sea difícil apagarlo mientras mira.

Hecho manualmente ¿Por qué?

Contar una historia, por lo que debes saber qué partes son relevantes y qué partes no lo son
A veces, aunque hay partes silenciosas, están sucediendo otras cosas visuales, como gestos salvajes, caras divertidas, una pausa para un efecto dramático.
Algunas veces sus cortes NO son cronológicos en orden. Para mejorar una historia, puede elegir clips o agregar clips de una línea de tiempo o video completamente diferente.

El problema principal que encontrará son los 'clics' que hace aquí cuando realiza los cortes de salto. Normalmente, desvincula el audio y el video y SOLO hace la transición de la parte de audio.

Espero que ayude. Feliz recorte

— eLouai
fuente

2

En Avid ProTools cuando está trabajando en un video con una pista de audio, hay algunas teclas de acceso rápido para seleccionar automáticamente todas las partes de audio donde la forma de onda llega a cero o cerca de ella.

Extender la selección también a la pista de video y cortarla también debería ser el truco, luego solo tiene que usar otra tecla de acceso rápido para unir las piezas cortadas sin pausas en el medio.

Creo que este flujo de trabajo podría aplicarse en casi todos los NLE (editores no lineales) disponibles.

— usuario3450548
fuente

1

Puede usar Laconia Trim para iOS o LaconiaTrimVideo.com para recortar en línea

Ambos recortes silencian en video automáticamente.

— niknamelogin
fuente

1

Cuando eliminas el silencio (tiempo para respirar, responder, pausa incómoda), por cada hora de espectáculo, hay 10 minutos de aire muerto. El sonido de nada.

La atención ya es difícil ... hablas a 120 WPM. Su cerebro puede manejar 400. En el espacio vacío es donde su cerebro se distrae.

Utilizo software de escritorio para MAC o PC para eliminar automáticamente el aire muerto / silencio de los archivos MP4 y MP3. Mire un gráfico, establezca el nivel de audio y el tiempo parcial, pulse ejecutar. Los cortes son automáticos y se vuelven a compilar en un solo archivo MP4 o MP3. Ver TimeBolt.io

Descargo de responsabilidad Soy dueño de esta herramienta.

— Doug Wulff
fuente

Producto interesante, felicidades.

— Rafael

Gracias @Rafael

— Doug Wulff

0

https://github.com/carykh/jumpcutter (licencia MIT) elimina automáticamente partes de un video que tienen poco o nada de audio. Se basa en ffmpeg y la tubería está codificada en Python 3.

Explicación:

Guión (licencia MIT, autor: carykh ):

from contextlib import closing
from PIL import Image
import subprocess
from audiotsm import phasevocoder
from audiotsm.io.wav import WavReader, WavWriter
from scipy.io import wavfile
import numpy as np
import re
import math
from shutil import copyfile, rmtree
import os
import argparse
from pytube import YouTube

def downloadFile(url):
    name = YouTube(url).streams.first().download()
    newname = name.replace(' ','_')
    os.rename(name,newname)
    return newname

def getMaxVolume(s):
    maxv = float(np.max(s))
    minv = float(np.min(s))
    return max(maxv,-minv)

def copyFrame(inputFrame,outputFrame):
    src = TEMP_FOLDER+"/frame{:06d}".format(inputFrame+1)+".jpg"
    dst = TEMP_FOLDER+"/newFrame{:06d}".format(outputFrame+1)+".jpg"
    if not os.path.isfile(src):
        return False
    copyfile(src, dst)
    if outputFrame%20 == 19:
        print(str(outputFrame+1)+" time-altered frames saved.")
    return True

def inputToOutputFilename(filename):
    dotIndex = filename.rfind(".")
    return filename[:dotIndex]+"_ALTERED"+filename[dotIndex:]

def createPath(s):
    #assert (not os.path.exists(s)), "The filepath "+s+" already exists. Don't want to overwrite it. Aborting."

    try:  
        os.mkdir(s)
    except OSError:  
        assert False, "Creation of the directory %s failed. (The TEMP folder may already exist. Delete or rename it, and try again.)"

def deletePath(s): # Dangerous! Watch out!
    try:  
        rmtree(s,ignore_errors=False)
    except OSError:  
        print ("Deletion of the directory %s failed" % s)
        print(OSError)

parser = argparse.ArgumentParser(description='Modifies a video file to play at different speeds when there is sound vs. silence.')
parser.add_argument('--input_file', type=str,  help='the video file you want modified')
parser.add_argument('--url', type=str, help='A youtube url to download and process')
parser.add_argument('--output_file', type=str, default="", help="the output file. (optional. if not included, it'll just modify the input file name)")
parser.add_argument('--silent_threshold', type=float, default=0.03, help="the volume amount that frames' audio needs to surpass to be consider \"sounded\". It ranges from 0 (silence) to 1 (max volume)")
parser.add_argument('--sounded_speed', type=float, default=1.00, help="the speed that sounded (spoken) frames should be played at. Typically 1.")
parser.add_argument('--silent_speed', type=float, default=5.00, help="the speed that silent frames should be played at. 999999 for jumpcutting.")
parser.add_argument('--frame_margin', type=float, default=1, help="some silent frames adjacent to sounded frames are included to provide context. How many frames on either the side of speech should be included? That's this variable.")
parser.add_argument('--sample_rate', type=float, default=44100, help="sample rate of the input and output videos")
parser.add_argument('--frame_rate', type=float, default=30, help="frame rate of the input and output videos. optional... I try to find it out myself, but it doesn't always work.")
parser.add_argument('--frame_quality', type=int, default=3, help="quality of frames to be extracted from input video. 1 is highest, 31 is lowest, 3 is the default.")

args = parser.parse_args()



frameRate = args.frame_rate
SAMPLE_RATE = args.sample_rate
SILENT_THRESHOLD = args.silent_threshold
FRAME_SPREADAGE = args.frame_margin
NEW_SPEED = [args.silent_speed, args.sounded_speed]
if args.url != None:
    INPUT_FILE = downloadFile(args.url)
else:
    INPUT_FILE = args.input_file
URL = args.url
FRAME_QUALITY = args.frame_quality

assert INPUT_FILE != None , "why u put no input file, that dum"

if len(args.output_file) >= 1:
    OUTPUT_FILE = args.output_file
else:
    OUTPUT_FILE = inputToOutputFilename(INPUT_FILE)

TEMP_FOLDER = "TEMP"
AUDIO_FADE_ENVELOPE_SIZE = 400 # smooth out transitiion's audio by quickly fading in/out (arbitrary magic number whatever)

createPath(TEMP_FOLDER)

command = "ffmpeg -i "+INPUT_FILE+" -qscale:v "+str(FRAME_QUALITY)+" "+TEMP_FOLDER+"/frame%06d.jpg -hide_banner"
subprocess.call(command, shell=True)

command = "ffmpeg -i "+INPUT_FILE+" -ab 160k -ac 2 -ar "+str(SAMPLE_RATE)+" -vn "+TEMP_FOLDER+"/audio.wav"

subprocess.call(command, shell=True)

command = "ffmpeg -i "+TEMP_FOLDER+"/input.mp4 2>&1"
f = open(TEMP_FOLDER+"/params.txt", "w")
subprocess.call(command, shell=True, stdout=f)



sampleRate, audioData = wavfile.read(TEMP_FOLDER+"/audio.wav")
audioSampleCount = audioData.shape[0]
maxAudioVolume = getMaxVolume(audioData)

f = open(TEMP_FOLDER+"/params.txt", 'r+')
pre_params = f.read()
f.close()
params = pre_params.split('\n')
for line in params:
    m = re.search('Stream #.*Video.* ([0-9]*) fps',line)
    if m is not None:
        frameRate = float(m.group(1))

samplesPerFrame = sampleRate/frameRate

audioFrameCount = int(math.ceil(audioSampleCount/samplesPerFrame))

hasLoudAudio = np.zeros((audioFrameCount))



for i in range(audioFrameCount):
    start = int(i*samplesPerFrame)
    end = min(int((i+1)*samplesPerFrame),audioSampleCount)
    audiochunks = audioData[start:end]
    maxchunksVolume = float(getMaxVolume(audiochunks))/maxAudioVolume
    if maxchunksVolume >= SILENT_THRESHOLD:
        hasLoudAudio[i] = 1

chunks = [[0,0,0]]
shouldIncludeFrame = np.zeros((audioFrameCount))
for i in range(audioFrameCount):
    start = int(max(0,i-FRAME_SPREADAGE))
    end = int(min(audioFrameCount,i+1+FRAME_SPREADAGE))
    shouldIncludeFrame[i] = np.max(hasLoudAudio[start:end])
    if (i >= 1 and shouldIncludeFrame[i] != shouldIncludeFrame[i-1]): # Did we flip?
        chunks.append([chunks[-1][1],i,shouldIncludeFrame[i-1]])

chunks.append([chunks[-1][1],audioFrameCount,shouldIncludeFrame[i-1]])
chunks = chunks[1:]

outputAudioData = np.zeros((0,audioData.shape[1]))
outputPointer = 0

lastExistingFrame = None
for chunk in chunks:
    audioChunk = audioData[int(chunk[0]*samplesPerFrame):int(chunk[1]*samplesPerFrame)]

    sFile = TEMP_FOLDER+"/tempStart.wav"
    eFile = TEMP_FOLDER+"/tempEnd.wav"
    wavfile.write(sFile,SAMPLE_RATE,audioChunk)
    with WavReader(sFile) as reader:
        with WavWriter(eFile, reader.channels, reader.samplerate) as writer:
            tsm = phasevocoder(reader.channels, speed=NEW_SPEED[int(chunk[2])])
            tsm.run(reader, writer)
    _, alteredAudioData = wavfile.read(eFile)
    leng = alteredAudioData.shape[0]
    endPointer = outputPointer+leng
    outputAudioData = np.concatenate((outputAudioData,alteredAudioData/maxAudioVolume))

    #outputAudioData[outputPointer:endPointer] = alteredAudioData/maxAudioVolume

    # smooth out transitiion's audio by quickly fading in/out

    if leng < AUDIO_FADE_ENVELOPE_SIZE:
        outputAudioData[outputPointer:endPointer] = 0 # audio is less than 0.01 sec, let's just remove it.
    else:
        premask = np.arange(AUDIO_FADE_ENVELOPE_SIZE)/AUDIO_FADE_ENVELOPE_SIZE
        mask = np.repeat(premask[:, np.newaxis],2,axis=1) # make the fade-envelope mask stereo
        outputAudioData[outputPointer:outputPointer+AUDIO_FADE_ENVELOPE_SIZE] *= mask
        outputAudioData[endPointer-AUDIO_FADE_ENVELOPE_SIZE:endPointer] *= 1-mask

    startOutputFrame = int(math.ceil(outputPointer/samplesPerFrame))
    endOutputFrame = int(math.ceil(endPointer/samplesPerFrame))
    for outputFrame in range(startOutputFrame, endOutputFrame):
        inputFrame = int(chunk[0]+NEW_SPEED[int(chunk[2])]*(outputFrame-startOutputFrame))
        didItWork = copyFrame(inputFrame,outputFrame)
        if didItWork:
            lastExistingFrame = inputFrame
        else:
            copyFrame(lastExistingFrame,outputFrame)

    outputPointer = endPointer

wavfile.write(TEMP_FOLDER+"/audioNew.wav",SAMPLE_RATE,outputAudioData)

'''
outputFrame = math.ceil(outputPointer/samplesPerFrame)
for endGap in range(outputFrame,audioFrameCount):
    copyFrame(int(audioSampleCount/samplesPerFrame)-1,endGap)
'''

command = "ffmpeg -framerate "+str(frameRate)+" -i "+TEMP_FOLDER+"/newFrame%06d.jpg -i "+TEMP_FOLDER+"/audioNew.wav -strict -2 "+OUTPUT_FILE
subprocess.call(command, shell=True)

deletePath(TEMP_FOLDER)

— Franck Dernoncourt
fuente

0

Puede usar la secuencia de comandos de usuario Pausa incómoda para Adobe After Effects. Costo del script de usuario: 59.99 USD

Captura de pantalla:

Manifestación:

Descargo de responsabilidad: en el momento en que escribo esta publicación (2019-06-08), trabajo para Adobe.

— Franck Dernoncourt
fuente