Data Mining PDF documents; using data conversion to reduce analysis time

By |2022-06-15T00:37:27-04:00May 31st, 2017|Categories: Automation, e-Discovery, Forensic, Scripts, Tesseract|Tags: , , , , , |

Problem A month ago, we became aware of a way to harvest legal notifications from a government web-site. Link Here The web-server allows simple requests to be crafted in order to download PDF documents related to court proceedings. After a few hours, we had over 25,000 PDF documents available to analyze. Now the question becomes: What is the [...]