"English-learning students’ scores on a state test designed to measure their mastery of the language fell sharply and have stayed low since 2018 — a drop that bilingual educators say might have less to do with students’ skills and more with sweeping design changes and the automated computer scoring system that were introduced that year.

English learners who used to speak to a teacher at their school as part of the Texas English Language Proficiency Assessment System now sit in front of a computer and respond to prompts through a microphone. The Texas Education Agency uses software programmed to recognize and evaluate students’ speech.

Students’ scores dropped after the new test was introduced, a Texas Tribune analysis shows. In the previous four years, about half of all students in grades 4-12 who took the test got the highest score on the test’s speaking portion, which was required to be considered fully fluent in English. Since 2018, only about 10% of test takers have gotten the top score in speaking each year."

  • ArbitraryValue@sh.itjust.works
    3 months ago

    I suspect that the human graders were the biased ones, and that this automated test is more accurate. Schools frequently inflate test results when given the opportunity (especially when low results reflect poorly on the school).

    How do students known to be fluent in English do on it? Do they pass reliably?

    Edit: Here’s a discussion of a similar phenomenon in the context of high-school graduation rates. Graduation rates regularly go up by a very large amount when standardized tests stop being required, but that’s not because otherwise-qualified students were doing poorly on standardized tests.

    • michaelmrose@lemmy.world
      3 months ago

      It’s possible for both things to be true. Human reviewers might be biased towards awarding higher scores and the computer could be dog shit at scoring. I have no idea how this can meaningfully be grading fluency. Fluency in a spoken language consists of vocabulary, grammar, and pronunciation.

      I have seen plenty of people who were very fluent and spoke with an extremely noticeable accent but were nonetheless comprehensible. Software is extremely likely to perform poorly at recognizing speech by non-native speakers and to fail individuals who are otherwise comprehensible. Because it won’t even recognize the words, it’s almost entirely testing pronunciation, and then denying such students access to electives that would allow them to further their education.

      • ArbitraryValue@sh.itjust.works
        3 months ago

        It’s quite possible that you’re right. I haven’t been able to find any research that attempts to quantify how accurate the software is, and without that I can only speculate.

    • 31337@sh.itjust.works
      3 months ago

      If I understand the article correctly, the system is doing some kind of AI speech recognition to score how people speak. It’s not a natural environment for people to talk to a computer, and could easily be biased by accents. I doubt any automated scoring that isn’t just multiple choice is accurate.

      • ArbitraryValue@sh.itjust.works
        3 months ago

        According to my own experience as a fluent English speaker who has a strong accent, modern voice-recognition systems have no problem with my accent, but I agree that they have flaws. They’re not perfect, but I expect that they’re more accurate than teachers because teachers have motives other than accuracy.

        Several districts have sued TEA to block the release of the last two years of ratings, arguing that recent changes to the metrics made it harder to get a good rating and could make them more susceptible to state intervention.

        • Sacreblew@lemmy.ca
          3 months ago

          My wife and her family have a hell of a time getting Google to understand their requests (Hispanic, wife is first generation), while it has no issue understanding mine, so I could see significant issues with the software misinterpreting.

          • ArbitraryValue@sh.itjust.works
            3 months ago

            Interesting. A few people have told me that I enunciate more clearly than a native speaker, so if that’s the case, my experience with speech-recognition systems will not be representative. With that said, older speech-recognition systems did have trouble understanding me, whereas newer ones don’t, so I think there really has been improvement.

            I tried to find data about how students fluent in English do on this test but I wasn’t able to. Comparing native English speakers to native Spanish speakers who have already learned English would be informative.

          • shalafi@lemmy.world
            3 months ago

            Huh. My wife’s Filipino accent is pretty heavy and Google almost always understands her.