| Content Extraction Time | Manual processing of each book page; several hours to days per subject | Automated extraction; completed in a few days per subject |
| Media (Images & YouTube URLs) Handling | Manual download and embedding; high chance of broken links | Automatic extraction, download, and embedding with accurate paths and folder mapping |
| HTML Standardization | Highly inconsistent, repetitive code edited manually across files | Uniform, standardized HTML structure via automated DOM and Jsoup transformations |
| Manual Errors | Frequent errors in HTML formatting, missing media, and incorrect folder structure | Human error nearly eliminated; output is consistent and reliable |
| Folder Structuring | Manual creation of subject/module folders; prone to mistakes | Automated module-wise and subject-wise folder generation using Java NIO |
| Team Productivity | High manual workload; time spent on repetitive tasks | Significant time savings; team focuses on high-value tasks |
| Scalability | Difficult to scale; effort increases linearly with more subjects/modules | Highly scalable; new subjects/modules processed with minimal changes |
| Reusability of Code/Framework | No reusable components; work repeated for each set of pages | Modular automation scripts reused across all subjects and future projects |
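
The "Folder Structuring" row refers to subject-wise and module-wise folder generation with Java NIO. The sketch below shows roughly what such a step could look like. It is not the project's actual code: the class name `FolderGenerator`, the `images` subfolder, and the underscore naming convention are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds a subject/module folder tree with Java NIO. */
final class FolderGenerator {

    private FolderGenerator() {}

    /**
     * Turns a display name into a filesystem-friendly folder name.
     * The underscore convention is an assumption, not taken from the project.
     */
    static String toFolderName(String name) {
        return name.trim()
                .replaceAll("[^A-Za-z0-9]+", "_") // collapse runs of other characters
                .replaceAll("^_+|_+$", "");       // drop leading/trailing underscores
    }

    /**
     * Creates root/subject/module/images for every module and returns the
     * module directories in the order given.
     */
    static List<Path> createModuleFolders(Path root, String subject, List<String> modules)
            throws IOException {
        Path subjectDir = root.resolve(toFolderName(subject));
        List<Path> created = new ArrayList<>();
        for (String module : modules) {
            Path moduleDir = subjectDir.resolve(toFolderName(module));
            // createDirectories also creates missing parents and does not
            // fail when the directory already exists, so reruns are safe.
            Files.createDirectories(moduleDir.resolve("images"));
            created.add(moduleDir);
        }
        return created;
    }
}
```

Calling `FolderGenerator.createModuleFolders(Path.of("output"), "Biology", List.of("Module 1: Cells", "Module 2"))` would create `output/Biology/Module_1_Cells/images` and `output/Biology/Module_2/images`. Because `Files.createDirectories` tolerates existing directories, the same call can be repeated when new modules are added, which is the kind of property the "Scalability" row depends on.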