TL;DR: for stuff that is NOT from Sonarr/Radarr (e.g. downloaded a long time ago, gotten from friends, RSS feeds, whatever), is there a better way to find subs than downloading everything from manual DDL sites and trying each one until one works (matching English text and correctly synced)?

I am not currently using Bazarr. I understand that it can catch anything from Sonarr that is missing subs, but that is not the use case I need. I am still open to it, but since most of the new stuff I get already has subs, I’m looking more at my stuff that is NOT coming from Sonarr, because that’s where I have the most missing subs. I’m thinking, since their GitHub says:

"Be aware that Bazarr doesn’t scan disk to detect series and movies: It only takes care of the series and movies that are indexed in Sonarr and Radarr."

that most of my use case is going to be manual searches. It also sounds like Bazarr uses the same kind of DDL sites I am already using (OpenSubtitles, Subscene) as its backend/source, so I’m curious whether there is any advantage vs. looking up old stuff on the sites directly.

I’m especially curious whether there is some way to match existing files with the correct subs, even if the file/folder names no longer contain the release group (e.g. via duration or other MediaInfo data, or maybe even via checksums). I know VLC can do it for a single file… but since I have a LOT of stuff with missing subs, I’m looking for a way to do something similar from a bash script or some other bulk job without getting a bunch of unsynced subs.
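
For what it’s worth, the checksum idea can be sketched in Python. This is a rough sketch of the OpenSubtitles-style "movie hash" (file size plus a 64-bit sum of the first and last 64 KiB), which as far as I know is the kind of hash that lookups such as VLC’s VLSub extension rely on. The names `opensubtitles_hash`, `hash_library` and `VIDEO_EXTENSIONS` are made up for illustration, and whether a given subtitle site still accepts hash lookups is an assumption to check. The hash only matches a byte-identical copy of the release the sub was uploaded for, so remuxed or re-encoded files will not hit.

```python
import os
import struct

CHUNK_SIZE = 64 * 1024  # 64 KiB is read from each end of the file
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi"}  # illustrative list, adjust as needed


def opensubtitles_hash(path):
    """Return the 16-hex-digit movie hash: size + sum of first and last 64 KiB."""
    size = os.path.getsize(path)
    if size < 2 * CHUNK_SIZE:
        raise ValueError(f"{path} is too small to hash")
    total = size
    with open(path, "rb") as handle:
        for offset in (0, size - CHUNK_SIZE):
            handle.seek(offset)
            data = handle.read(CHUNK_SIZE)
            # Sum the chunk as little-endian unsigned 64-bit integers,
            # wrapping around at 64 bits.
            for (value,) in struct.iter_unpack("<Q", data):
                total = (total + value) & 0xFFFFFFFFFFFFFFFF
    return f"{total:016x}"


def hash_library(root):
    """Walk a folder tree and map each video file path to its hash."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in VIDEO_EXTENSIONS:
                continue
            full_path = os.path.join(dirpath, name)
            try:
                results[full_path] = opensubtitles_hash(full_path)
            except ValueError:
                continue  # skip files too small to be real video
    return results
```

Running `hash_library("/path/to/media")` gives a path-to-hash mapping that a bulk job could feed into whatever hash search a subtitle provider offers, instead of guessing from file names.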

  • ramenbellic
    7 months ago

    Running it through Subtitle Edit with WhisperX can help a lot for longer movies. It breaks the file into much smaller pieces and runs Whisper on them one by one before stitching the result back together.
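
    To illustrate the chunk-and-stitch idea in general terms: each piece gets transcribed with timestamps relative to its own start, so stitching mostly means shifting those timestamps by the piece’s offset and renumbering. The sketch below is NOT how Subtitle Edit or WhisperX implement it internally; the function names and the `(chunk_start, segments)` data shape are hypothetical, and the transcription step itself is left out.

```python
def format_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def stitch_chunks(chunks):
    """Merge per-chunk segments into one timeline.

    chunks: list of (chunk_start_seconds, segments), where segments is a
    list of (start, end, text) tuples relative to the start of that chunk.
    """
    merged = []
    for chunk_start, segments in chunks:
        for start, end, text in segments:
            # Shift chunk-relative times onto the full file's timeline.
            merged.append((chunk_start + start, chunk_start + end, text.strip()))
    merged.sort(key=lambda segment: segment[0])
    return merged


def to_srt(segments):
    """Render (start, end, text) segments as SRT text with 1-based numbering."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

    For example, a segment at 0.5 s inside a chunk that starts at 600 s ends up at 00:10:00,500 in the stitched file.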

    • BlackFlagsForever@lemmy.dbzer0.comOP
      7 months ago

      Interesting, that actually sounds like an awesome idea for the OTA TV rips, because I doubt I would even be able to find anything that matches by duration on normal sub sites.

      I hadn’t heard of Whisper GUI / WhisperX before, but I see it has a GitHub. Do you know if it is cloud-based or something you can run entirely locally? (I’m wondering in case I need to allow it net access, and I’m also curious whether it would eat a lot of bandwidth for roughly 2 seasons of broadcast TV shows, i.e. somewhere around 30-35 hrs worth of audio.)

      Edit: apparently Whisper can be run entirely offline according to this, so if WhisperX is a fork, then I assume it would allow this too.

    • SchizoDenji@lemm.ee
      7 months ago

      Does it work well with movies in other languages? I assume background music (BGM) might cause errors?

      • ramenbellic
        6 months ago

        My limited experience has been positive with non-English languages.