• conditional_soup@lemm.ee
    2 days ago

    The client wants to drag and drop their own personalized Excel file, with no guaranteed formatting, column order, or data contract, to import their data into our system <3

    • veroxii@aussie.zone
      2 days ago

      Strangely enough, we actually solved this problem with AI a few months back. We upload the Excel file to Gemini with a prompt to extract the data we need in a specific JSON format, and it works surprisingly well.

      • conditional_soup@lemm.ee
        2 days ago

        How well? Bet-your-life-on-it well, or “fewer hallucinations than we would have guessed” well? I’ve toyed around with OpenAI models for logging supply room check-offs in a JSON format, and it went better than I hoped but worse than I needed.

        • veroxii@aussie.zone
          2 days ago

          Really well. Temperature turned down all the way, and Gemini has this new feature to write and execute code… Not function calling… It can write a small Python script, run it, and return the output.

          So our prompt explains the Excel spreadsheet, then tells it exactly the format we need, and then tells it to use Python and pandas to read in the CSV, clean it up, and reshape it to match what we expect, and voilà.

          So hallucinations are not really an issue with the data, as it’s simply writing code which then deterministically processes and returns the data.

          Edit to add more info: basically, Gemini can create and run a lambda function on the fly. And if you’re a coder, you can really guide the prompt. E.g. "Load this into pandas. Then remove all the empty columns. Also remove the total rows. Now unpivot the data so the months are not columns but in separate rows with a column called month."

          You get the idea.
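
          For anyone wondering what the generated script might look like, here is a rough sketch of those steps in pandas. The sample data, the "Product" column, the "Total" row label, and the `reshape` helper are all made up for illustration; a real client file will look different and the actual generated code will vary.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a client's exported sheet.
RAW = """Product,Notes,Jan,Feb,Mar
Widgets,,10,12,9
Gadgets,,4,7,5
Total,,14,19,14
"""


def reshape(df: pd.DataFrame, id_col: str = "Product") -> pd.DataFrame:
    """Drop empty columns and total rows, then unpivot month columns into rows."""
    # Remove columns that contain no values at all.
    df = df.dropna(axis="columns", how="all")
    # Remove total rows (assumed here to be labelled "Total" in the id column).
    is_total = df[id_col].astype(str).str.strip().str.lower().eq("total")
    df = df[~is_total]
    # Unpivot: one row per (product, month) instead of one column per month.
    return df.melt(id_vars=[id_col], var_name="month", value_name="value")


df = pd.read_csv(io.StringIO(RAW))
tidy = reshape(df)
print(tidy)
```

          The point is that once a script like this exists, the numbers come from pandas rather than from the model, so it is worth checking the script's assumptions (which rows count as totals, which columns are months) instead of spot-checking the output values.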

      • Echo Dot@feddit.uk
        2 days ago

        It would still have to be in at least a somewhat consistent format. Even a human would require that.

        If they’re just going to write the details however they feel on any particular day and then expect someone or something to be able to interpret that, they’re going to have a bad time.