Title text: The heartfelt tune it plays is CC licensed, and you can get it from my seed on JoinDiaspora.net whenever that project gets going.


Transcript

2003:

[Cueball approaches a bearded fellow.]

Cueball: Did you get my essay?
Bearded Fellow: Yeah, it was good! But it was a .doc; You should really use a more open-
Cueball: Give it a rest already. Maybe we just want to live our lives and use software that works, not get wrapped up in your stupid nerd turf wars.
Bearded Fellow: I just want people to care about the infrastructures we’re building and who-
Cueball: No, you just want to feel smugly superior. You have no sense of perspective and are probably autistic.

2010:

Cueball: Oh my God! We handed control of our social world to Facebook and they’re DOING EVIL STUFF!
Bearded Fellow: Do you see this?

[Inset, the bearded fellow rubs his index and middle fingers against his thumb.]

Bearded Fellow: It’s the world’s tiniest open-source violin.


  • oatscoop
    link
    fedilink
    English
    arrow-up
    3
    ·
    10 months ago

    The OCR struggles with some PDFs for whatever reasons: font, formatting, etc.

    There are 3rd party PDF OCR websites/programs that work better. If I’m having issues I run it through one of those first.

    • greenskye@lemm.ee
      link
      fedilink
      English
      arrow-up
      2
      ·
      10 months ago

      Any suggestions? Even the good ones had error rates that might not matter for a couple of pages, but when scaled to a 500 page book, even a 1% error rate results in an annoying level of typos.

      • oatscoop
        link
        fedilink
        English
        arrow-up
        1
        ·
        10 months ago

        I use gImageReader + Tesseract, but that probably doesn’t meet your criteria. Unfortunately OCR is very rarely perfect unless the input is perfectly clear and with a “OCR friendly” font/formatting. There are “AI powered” OCR out there, but I can’t speak to how well they work and I don’t know of any free ones.