Update Home
parent 13861e7613
commit a26538036e
Home.md

# Email to Chronological Markdown Converter

## Overview
This script processes email files (.eml) into a chronological narrative in Markdown format, optimized for processing with Large Language Models (LLMs). It extracts essential information from emails while removing unnecessary metadata, creating a clean, temporal narrative that can be easily analyzed.

## Purpose
The main goal is to convert email threads and nested communications into a simple, chronological text format that LLMs can effectively process to:
- Analyze communication patterns
- Extract key information
- Track project development
- Identify important decisions and their timeline
- Understand relationships and interactions between participants

## Core Functionality

### Input Processing
- Reads .eml files from the current directory
- Handles nested .eml files found as attachments
- Supports multiple languages (English, Italian)
- Extracts sender name and timestamp as primary identifiers
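
The input steps above could be sketched roughly as follows. This is a minimal illustration built on Python's standard `email` package, not the script's actual code: the function names `parse_eml`, `identifiers`, and `nested_emails` are hypothetical, nested emails are assumed to arrive as `message/rfc822` parts, and the timestamp is assumed to keep the sender's wall-clock time without time zone conversion.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from email.utils import parseaddr, parsedate_to_datetime


def parse_eml(raw: bytes) -> EmailMessage:
    """Parse the raw bytes of one .eml file with the modern email policy."""
    return message_from_bytes(raw, policy=policy.default)


def identifiers(msg: EmailMessage) -> tuple[str, str]:
    """Return (YYYYMMDDhhmmss timestamp, sender name) for a message."""
    name, address = parseaddr(str(msg["From"]))
    sender = name or address  # fall back to the address if no display name
    sent = parsedate_to_datetime(str(msg["Date"]))
    # Assumption: the sender's wall-clock time is kept (no UTC conversion).
    return sent.strftime("%Y%m%d%H%M%S"), sender


def nested_emails(msg: EmailMessage) -> list[EmailMessage]:
    """Collect emails attached as message/rfc822 parts, at any depth."""
    return [
        part.get_payload(0)
        for part in msg.walk()
        if part.get_content_type() == "message/rfc822"
    ]
```

A driver loop might call `parse_eml(path.read_bytes())` for each `path` in `sorted(Path(".").glob("*.eml"))`. Messages without a `Date` header, and emails attached under other content types, are not handled in this sketch.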

### Content Management
- Maintains a single chronological file (cronologia.md)
- Removes redundant metadata and formatting
- Preserves essential content in plain text
- Converts tables to markdown format
- Links to non-email attachments
- Eliminates duplicate entries
- Eliminates duplicate entries
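
One way the duplicate removal and chronological ordering could work is sketched below. The source does not say how duplicates are detected, so the approach here is an assumption: two entries count as duplicates when timestamp, sender, and whitespace-normalized body all match. The function name `dedupe_and_sort` and the tuple layout are hypothetical.

```python
import hashlib

Entry = tuple[str, str, str]  # (timestamp, sender, body)


def dedupe_and_sort(entries: list[Entry]) -> list[Entry]:
    """Drop repeated entries, then order the rest by timestamp."""
    seen: set[str] = set()
    unique: list[Entry] = []
    for timestamp, sender, body in entries:
        # Collapse whitespace so re-wrapped copies of a message still match.
        normalized = " ".join(body.split())
        key = hashlib.sha256(
            f"{timestamp}|{sender}|{normalized}".encode("utf-8")
        ).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        unique.append((timestamp, sender, body))
    # Fixed-width YYYYMMDDhhmmss strings sort chronologically as plain text.
    return sorted(unique, key=lambda entry: entry[0])
```

Because the timestamp has a fixed width, no date parsing is needed at sort time, and the stable sort keeps the original order of entries that share a timestamp.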

### Format Structure
```markdown
## YYYYMMDDhhmmss|Sender Name

Message content in plain text...
Tables converted to markdown format...

### Attachments
- [[document1.pdf]]
- [[image1.jpg]]

---
```
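
As an illustration of the layout above, a single entry could be rendered with a small helper like this. The name `render_entry` and its parameters are hypothetical, and the exact blank-line placement is inferred from the template rather than confirmed by the script.

```python
from collections.abc import Sequence


def render_entry(
    timestamp: str,
    sender: str,
    body: str,
    attachments: Sequence[str] = (),
) -> str:
    """Render one message in the '## timestamp|sender' entry layout."""
    lines = [f"## {timestamp}|{sender}", "", body.strip(), ""]
    if attachments:
        lines.append("### Attachments")
        # Attachments are referenced with wiki-style [[name]] links.
        lines.extend(f"- [[{name}]]" for name in attachments)
        lines.append("")
    lines.append("---")  # separator between entries
    return "\n".join(lines) + "\n"
```

Appending the result of each call to cronologia.md, in timestamp order, would produce a file in the structure shown in the template.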

### Key Features
1. Chronological Organization
- All messages sorted by date
- Consistent timestamp format
- Clear sender identification

2. Content Cleaning
- Removes email headers
- Eliminates signatures
- Strips formatting
- Preserves table structure

3. Attachment Handling
- Creates 'attachments' folder
- Maintains file references
- Processes nested emails
- Prevents duplicates
- Prevents duplicates
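
To make the "preserves table structure" point concrete, here is a hedged sketch of turning already-extracted table rows into a Markdown table. It assumes the cells have been pulled out of the email's HTML beforehand and that the first row is the header; `rows_to_markdown` is a hypothetical name, not a function known to exist in the script.

```python
def rows_to_markdown(rows: list[list[str]]) -> str:
    """Convert rows of cell text into a Markdown pipe table."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(cells: list[str]) -> str:
        # Pad short rows so every line has the same number of columns.
        padded = list(cells) + [""] * (width - len(cells))
        # Collapse whitespace and escape pipes, which would split the cell.
        cleaned = [" ".join(str(c).split()).replace("|", "\\|") for c in padded]
        return "| " + " | ".join(cleaned) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(row) for row in body)
    return "\n".join(lines)
```

Cells are flattened to a single line because a Markdown pipe table cannot hold line breaks inside a cell.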

## LLM Processing Considerations
The output is structured to facilitate:
1. Temporal analysis
2. Relationship mapping
3. Topic tracking
4. Decision point identification
5. Project timeline reconstruction

The consistent formatting and cleaned content allow LLMs to focus on:
- Message content analysis
- Temporal relationships
- Communication patterns
- Project development tracking
- Key information extraction

This format enables LLMs to effectively process email communications while maintaining the contextual and temporal relationships essential for understanding the narrative flow of information.
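
For downstream use, the chronology can also be split back into individual entries, for example to feed an LLM one message or one time window at a time. The sketch below assumes every entry starts with a `## ` heading holding a 14-digit timestamp, a pipe, and the sender, and ends with a `---` separator, as in the format shown earlier; `split_entries` is a hypothetical helper name.

```python
import re

# Entry heading: '## ' + 14-digit timestamp + '|' + sender name.
HEADER = re.compile(r"^## (\d{14})\|(.+)$", re.MULTILINE)


def split_entries(text: str) -> list[tuple[str, str, str]]:
    """Split chronology text into (timestamp, sender, body) tuples."""
    matches = list(HEADER.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        # Each body runs until the next entry heading or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        if body.endswith("---"):
            body = body[:-3].rstrip()  # drop the trailing entry separator
        entries.append((match.group(1), match.group(2).strip(), body))
    return entries
```

The `### Attachments` subheading is left inside the body, since the pattern only matches second-level headings that carry a timestamp.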