Email to Chronological Markdown Converter
Overview
This script converts email files (.eml) into a single chronological narrative in Markdown, intended as input for Large Language Models (LLMs). It keeps the essential information from each email (sender, timestamp, body text, tables, attachment references) and drops the remaining metadata, so the result reads as a clean timeline that is easy to analyze.
Purpose
The main goal is to convert email threads and nested communications into a simple, chronological text format that LLMs can effectively process to:
- Analyze communication patterns
- Extract key information
- Track project development
- Identify important decisions and their timeline
- Understand relationships and interactions between participants
Core Functionality
Input Processing
- Reads .eml files from the current directory
- Handles nested .eml files found as attachments
- Supports multiple languages (English, Italian)
- Extracts sender name and timestamp as primary identifiers
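As a rough illustration of the last point, the sender name and timestamp can be pulled from a raw .eml payload with Python's standard `email` package. This is a minimal sketch, not the script's actual code: the function name `sender_and_timestamp` is hypothetical, and keeping the time in the offset written in the `Date` header (rather than converting to UTC) is an assumption.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr, parsedate_to_datetime


def sender_and_timestamp(raw: bytes) -> tuple[str, str]:
    """Return (sender name, YYYYMMDDhhmmss) for one raw .eml payload."""
    msg = message_from_bytes(raw, policy=policy.default)
    name, address = parseaddr(str(msg.get("From", "")))
    # Fall back to the address when the header carries no display name.
    sender = name or address
    # Assumption: the time is kept as written in the Date header (no UTC shift).
    sent = parsedate_to_datetime(str(msg["Date"]))
    return sender, sent.strftime("%Y%m%d%H%M%S")


sample = (
    b"From: Anna Rossi <anna@example.com>\r\n"
    b"Date: Tue, 05 Mar 2024 14:30:00 +0100\r\n"
    b"Subject: Kickoff\r\n"
    b"\r\n"
    b"See you at 3.\r\n"
)
print(sender_and_timestamp(sample))  # ('Anna Rossi', '20240305143000')
```

A message without a parseable `Date` header raises an exception here; a real run would need a fallback for that case.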
Content Management
- Maintains a single chronological file (cronologia.md)
- Removes redundant metadata and formatting
- Preserves essential content in plain text
- Converts tables to markdown format
- Links to non-email attachments
- Eliminates duplicate entries
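The table conversion step could look roughly like the sketch below. It assumes the table has already been extracted into a list of rows (how the script reads tables out of the email body is not described here), and the helper name `rows_to_markdown` is hypothetical.

```python
def rows_to_markdown(rows: list[list[str]]) -> str:
    """Render rows as a Markdown pipe table; the first row is the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Escape pipes so cell text cannot break the layout, then pad short rows.
    padded = [
        [cell.strip().replace("|", "\\|") for cell in row] + [""] * (width - len(row))
        for row in rows
    ]
    header, *body = padded
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join(["---"] * width) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


table = rows_to_markdown([["Item", "Qty"], ["Bolts", "40"], ["Nuts"]])
print(table)
```

Short rows are padded with empty cells so every line has the same number of columns, which most Markdown renderers require.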
Format Structure
## YYYYMMDDhhmmss|Sender Name
Message content in plain text...
Tables converted to markdown format...
### Attachments
- [[document1.pdf]]
- [[image1.jpg]]
---
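The layout above can be produced with a few lines of Python. The following is a sketch under stated assumptions: `Entry`, `render_entry`, and `render_chronology` are hypothetical names, and treating two entries as duplicates only when every field matches exactly is an assumption about how deduplication works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    timestamp: str  # YYYYMMDDhhmmss, so string order equals time order
    sender: str
    body: str
    attachments: tuple[str, ...] = ()


def render_entry(entry: Entry) -> str:
    """Format one message in the heading/body/attachments/separator layout."""
    lines = [f"## {entry.timestamp}|{entry.sender}", entry.body.strip()]
    if entry.attachments:
        lines.append("### Attachments")
        lines += [f"- [[{name}]]" for name in entry.attachments]
    lines.append("---")
    return "\n".join(lines)


def render_chronology(entries: list[Entry]) -> str:
    # Frozen dataclass instances are hashable, so a set drops exact duplicates.
    unique = set(entries)
    ordered = sorted(unique, key=lambda e: (e.timestamp, e.sender, e.body))
    return "\n".join(render_entry(e) for e in ordered) + "\n"


first = Entry("20240305143000", "Anna Rossi", "See you at 3.", ("agenda.pdf",))
second = Entry("20240301090000", "Marco Bianchi", "Draft attached.")
chronology = render_chronology([first, second, first])
print(chronology)
```

Because the timestamp is a fixed-width digit string, a plain string sort is enough to put entries in chronological order.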
Key Features
- Chronological Organization
  - All messages sorted by date
  - Consistent timestamp format
  - Clear sender identification
- Content Cleaning
  - Removes email headers
  - Eliminates signatures
  - Strips formatting
  - Preserves table structure
- Attachment Handling
  - Creates 'attachments' folder
  - Maintains file references
  - Processes nested emails
  - Prevents duplicates
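Attachment handling along these lines can be sketched with the standard `email` package. The function name `split_attachments` is hypothetical, and two choices here are assumptions rather than documented behavior: a nested email is recognized either by the `message/rfc822` content type or by a `.eml` filename, and a file that already exists in the folder is not written again.

```python
import tempfile
from email import message_from_bytes, policy
from email.message import EmailMessage
from pathlib import Path


def split_attachments(msg: EmailMessage, folder: Path):
    """Save non-email attachments to folder; return (saved names, nested emails)."""
    saved, nested = [], []
    for part in msg.iter_attachments():
        if part.get_content_type() == "message/rfc822":
            nested.append(part.get_content())  # the embedded message object
            continue
        name = Path(part.get_filename() or "").name  # drop any directory part
        if not name:
            continue
        payload = part.get_payload(decode=True) or b""
        if name.lower().endswith(".eml"):
            nested.append(message_from_bytes(payload, policy=policy.default))
            continue
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / name
        if not target.exists():  # assumption: an existing file is kept as-is
            target.write_bytes(payload)
        if name not in saved:
            saved.append(name)
    return saved, nested


# Build a small message with one PDF and one nested email, then re-parse it
# from bytes to mimic reading an .eml file from disk.
inner = EmailMessage()
inner["Subject"] = "Forwarded"
inner.set_content("Original note.")

outer = EmailMessage()
outer["Subject"] = "Report"
outer.set_content("See attachments.")
outer.add_attachment(b"%PDF-1.4", maintype="application", subtype="pdf",
                     filename="report.pdf")
outer.add_attachment(inner)

parsed = message_from_bytes(outer.as_bytes(), policy=policy.default)
with tempfile.TemporaryDirectory() as tmp:
    saved, nested = split_attachments(parsed, Path(tmp) / "attachments")
print(saved, [m["Subject"] for m in nested])
```

The returned nested messages can be fed back through the same processing as top-level emails, which is one way to handle emails attached to emails.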
LLM Processing Considerations
The output is structured to facilitate:
- Temporal analysis
- Relationship mapping
- Topic tracking
- Decision point identification
- Project timeline reconstruction
The consistent formatting and cleaned content allow LLMs to focus on:
- Message content analysis
- Temporal relationships
- Communication patterns
- Project development tracking
- Key information extraction
This format enables LLMs to effectively process email communications while maintaining the contextual and temporal relationships essential for understanding the narrative flow of information.
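The same regularity also makes the file easy to read back programmatically, for example to check the timeline before handing it to an LLM. The sketch below assumes headings follow the `## YYYYMMDDhhmmss|Sender Name` layout exactly; the names `HEADING` and `timeline` are hypothetical.

```python
import re
from datetime import datetime

# Matches entry headings only; "### Attachments" lines do not match.
HEADING = re.compile(r"^## (\d{14})\|(.+)$", re.MULTILINE)


def timeline(markdown: str) -> list[tuple[datetime, str]]:
    """Return (datetime, sender) pairs in the order they appear in the file."""
    return [
        (datetime.strptime(stamp, "%Y%m%d%H%M%S"), sender.strip())
        for stamp, sender in HEADING.findall(markdown)
    ]


document = (
    "## 20240301090000|Marco Bianchi\n"
    "Draft attached.\n"
    "---\n"
    "## 20240305143000|Anna Rossi\n"
    "See you at 3.\n"
    "### Attachments\n"
    "- [[agenda.pdf]]\n"
    "---\n"
)
events = timeline(document)
print(events[-1][0] - events[0][0])  # elapsed time between first and last entry
```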