Implemented an automatic reconnection system for the PLC with exponential backoff, allowing streaming and dataset recording to resume after disconnections. Added new API endpoints to manage reconnection state and to enable/disable this functionality. Updated the system configuration and state files, as well as the user interface, to reflect reconnection status and improve the user experience.

This commit is contained in:
Miguel 2025-08-04 18:26:22 +02:00
parent d1ca6f6ed6
commit df07451079
10 changed files with 1441 additions and 21 deletions

View File

@ -819,4 +819,207 @@ streaming.intervalId = setInterval(() => {
}, finalRefreshRateMs);
```
**Result**: The refresh rate now works correctly in both streaming and fallback modes, with detailed logs for monitoring.
---
## Automatic Reconnection with Exponential Backoff
**Date**: 2025-08-04
**Request**: "Implement automatic reconnection when communication with the PLC is lost. After losing communication it should close the connection and retry with exponential backoff: 1s, 2s, 4s, 8s, 16s, 32s, up to a maximum of 1 minute. It should then keep retrying every 1 minute."
**Problem**: The system previously had no automatic reconnection capability when communication with the PLC was lost. Timeout and connection-abort errors were logged, but no reconnection was attempted.
**Technical Implementation**:
### PLCClient Enhanced with Automatic Reconnection
```python
# New properties for reconnection management
self.reconnection_enabled = True
self.is_reconnecting = False
self.consecutive_failures = 0
self.max_backoff_seconds = 60 # Maximum 1 minute backoff
self.base_backoff_seconds = 1 # Start with 1 second
# Exponential backoff algorithm
def _calculate_backoff_delay(self) -> float:
if self.consecutive_failures == 0:
return 0
power = self.consecutive_failures - 1
delay = self.base_backoff_seconds * (2 ** power)
return min(delay, self.max_backoff_seconds)
# Connection error detection
def _is_connection_error(self, error_str: str) -> bool:
connection_error_indicators = [
'connection timed out', 'connection abort', 'recv tcp',
'send tcp', 'socket', 'timeout', 'connection refused',
'connection reset', 'broken pipe', 'network is unreachable'
]
error_lower = str(error_str).lower()
return any(indicator in error_lower for indicator in connection_error_indicators)
```
### Automatic Reconnection Process
- **Error Detection**: Connection errors are detected automatically in `read_variable()`
- **Connection Close**: The connection is forcibly closed before each reconnection attempt
- **Background Thread**: Reconnection runs in a separate thread so it does not block other operations
- **Backoff Sequence**: 1s → 2s → 4s → 8s → 16s → 32s → 60s (max) → then every 60s
- **Thread Safety**: Locks prevent multiple simultaneous reconnection attempts
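The backoff sequence above can be reproduced with a standalone helper (a minimal sketch mirroring the `_calculate_backoff_delay` logic shown earlier; the function name here is illustrative):

```python
def backoff_delay(consecutive_failures: int,
                  base_seconds: float = 1,
                  max_seconds: float = 60) -> float:
    """Exponential backoff capped at max_seconds; 0 when there are no failures yet."""
    if consecutive_failures == 0:
        return 0
    return min(base_seconds * 2 ** (consecutive_failures - 1), max_seconds)

# First eight attempts: 1s, 2s, 4s, 8s, 16s, 32s, then capped at 60s
delays = [backoff_delay(n) for n in range(1, 9)]
print(delays)  # [1, 2, 4, 8, 16, 32, 60, 60]
```

Once the cap is reached, every further attempt returns the 60s maximum, which matches the "retry every 1 minute" requirement.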
### API Integration
```python
# New API endpoints
/api/plc/reconnection/status   # GET  - Detailed reconnection status
/api/plc/reconnection/enable   # POST - Enable automatic reconnection
/api/plc/reconnection/disable  # POST - Disable automatic reconnection
# Enhanced status response
{
"plc_reconnection": {
"enabled": true,
"active": false,
"consecutive_failures": 0,
"next_delay_seconds": 0,
"max_backoff_seconds": 60
},
"plc_connection_info": {
"connected": true,
"reconnection_enabled": true,
"is_reconnecting": false,
"consecutive_failures": 0
}
}
```
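A client of the status endpoint mostly needs to unpack the payload above; a minimal sketch (the response shape mirrors the JSON example, and the HTTP call itself is omitted — `raw` stands in for a fetched body):

```python
import json

# Hypothetical body returned by GET /api/plc/reconnection/status
raw = '''{
  "enabled": true,
  "active": true,
  "consecutive_failures": 3,
  "next_delay_seconds": 4,
  "max_backoff_seconds": 60
}'''

status = json.loads(raw)
if status["active"]:
    line = (f"Reconnecting: next attempt in {status['next_delay_seconds']}s "
            f"(failure #{status['consecutive_failures']})")
else:
    line = "Connected" if status["enabled"] else "Reconnection disabled"
print(line)  # Reconnecting: next attempt in 4s (failure #3)
```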
### Frontend Integration
```javascript
// Dynamic status display with reconnection info
if (reconnectionInfo.active) {
    statusText = '🔌 PLC: Reconnecting...';
    statusClass = 'status-item status-reconnecting';
    reconnectionDetails = `🔄 Next attempt in ${nextDelay}s (failure #${failures})`;
}
```
```css
/* New CSS class with animation */
.status-reconnecting {
    background: #ff9800;
    color: white;
    animation: pulse 2s infinite;
}
```
### Key Features Implemented
1. **Automatic Detection**: Identifies connection errors specific to the S7 protocol
2. **Exponential Backoff**: 1s, 2s, 4s, 8s, 16s, 32s, 60s sequence, capped at 1 minute
3. **Clean Reconnection**: Fully closes the connection before each attempt
4. **Visual Feedback**: Animated visual state in the web interface during reconnection
5. **API Control**: Endpoints to enable/disable and monitor reconnection
6. **Thread Safety**: Safe thread handling to avoid conflicts
7. **Graceful Shutdown**: Proper thread cleanup on disconnect
**Result**: The system now reconnects automatically when communication with the PLC is lost, using exponential backoff up to a 1-minute maximum, with real-time visual indication of the reconnection state.
### ⚡ Automatic Dataset Resume After Reconnection
**Problem**: Although reconnection worked, streaming/recording did not resume automatically after a successful reconnection.
**Solution**: A callback system plus tracking of active datasets:
```python
# Callback system in PLCClient
self.reconnection_callbacks = [] # List of callback functions
def _notify_reconnection_success(self):
"""Notify all registered callbacks about successful reconnection"""
for callback in self.reconnection_callbacks:
try:
callback()
except Exception as e:
if self.logger:
self.logger.error(f"Error in reconnection callback: {e}")
# DataStreamer registers for reconnection callbacks
self.plc_client.add_reconnection_callback(self._on_plc_reconnected)
# Track datasets before disconnection
def _track_disconnection(self):
"""Track which datasets were active before disconnection for auto-resume"""
self.datasets_before_disconnection = set()
for dataset_id in self.config_manager.active_datasets:
if (dataset_id in self.dataset_threads and
self.dataset_threads[dataset_id].is_alive()):
self.datasets_before_disconnection.add(dataset_id)
# Resume datasets after reconnection
def _on_plc_reconnected(self):
"""Callback method called when PLC reconnects successfully"""
datasets_to_resume = self.datasets_before_disconnection.copy()
resumed_count = 0
for dataset_id in datasets_to_resume:
if (dataset_id in self.config_manager.datasets and
dataset_id in self.config_manager.active_datasets):
thread_exists = dataset_id in self.dataset_threads
thread_alive = thread_exists and self.dataset_threads[dataset_id].is_alive()
if not thread_exists or not thread_alive:
self.start_dataset_streaming(dataset_id)
resumed_count += 1
```
### 🔄 Complete Reconnection Flow
1. **Connection Lost** → Error detected in `read_variable()`
2. **Track Active Datasets** → Active datasets are saved before the connection is lost
3. **Dataset Loops End** → The while loops exit because `is_connected()` returns False
4. **Background Reconnection** → The exponential-backoff process starts
5. **Successful Reconnection** → The connection is re-established and callbacks are notified
6. **Auto-Resume Streaming** → Dataset loops that were active are restarted automatically
7. **CSV Recording Resumed** → Recording resumes automatically
8. **Visual Update** → The interface shows the reconnected state
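The track-then-resume handshake (steps 2, 5 and 6) can be condensed into a self-contained sketch; the class and method names below are illustrative stand-ins, not the real `PLCClient`/`DataStreamer` API:

```python
class MiniClient:
    """Stand-in for PLCClient's callback plumbing."""
    def __init__(self):
        self.on_disconnect = []   # fired when a connection error is detected
        self.on_reconnect = []    # fired after a successful reconnection

    def lose_connection(self):
        for cb in self.on_disconnect:
            cb()

    def reconnect(self):
        for cb in self.on_reconnect:
            cb()


class MiniStreamer:
    """Stand-in for DataStreamer's tracking and auto-resume."""
    def __init__(self, client, active):
        self.active = set(active)          # datasets currently streaming
        self.before_disconnection = set()
        client.on_disconnect.append(self.track)
        client.on_reconnect.append(self.resume)

    def track(self):
        # Step 2: remember what was running before the loops die
        self.before_disconnection = set(self.active)
        self.active.clear()                # loops end once disconnected

    def resume(self):
        # Step 6: restart exactly what was active before
        self.active = set(self.before_disconnection)


client = MiniClient()
streamer = MiniStreamer(client, {"DAR"})
client.lose_connection()
assert streamer.active == set()            # loops stopped
client.reconnect()
assert streamer.active == {"DAR"}          # streaming auto-resumed
```

The key design point is that tracking must fire *before* the connection state flips, which is why the real code notifies disconnection callbacks first.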
**Result**: The system now reconnects AND automatically resumes streaming/recording for every dataset that was active before the disconnection, providing full service continuity.
## 🚨 Critical Bug Fix: Dead Thread Cleanup
### Problem Detected
During testing it was discovered that although tracking and auto-resume executed correctly (visible in the logs), **the new threads produced no data**. The log showed:
- ✅ Tracking working: "Tracked 1 active datasets for auto-resume: ['DAR']"
- ✅ Auto-resume executing: "Successfully resumed streaming for dataset 'DAR'"
- ❌ No data afterwards: no more "Dataset 'DAR': CSV: 6 vars, UDP: 0 vars" logs
### Root Cause
A bug in `start_dataset_streaming()` in `core/streamer.py`:
```python
# PROBLEMATIC CODE - BEFORE
if dataset_id in self.dataset_threads:
return True # Already running ← ❌ BUG!
```
**The problem**: When the PLC disconnected, the thread died but **remained in the dictionary**. On auto-resume, the method saw the existing entry and returned `True` without checking whether the thread was actually alive, **so the new thread was never created**.
### Implemented Solution
```python
# CORRECTED CODE - AFTER
if dataset_id in self.dataset_threads:
existing_thread = self.dataset_threads[dataset_id]
if existing_thread.is_alive():
return True # Already running and alive
else:
# Clean up dead thread
del self.dataset_threads[dataset_id]
# Fall through to create a new thread...
```
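The fixed check can be exercised in isolation; a minimal sketch with a stand-in thread dictionary and helper (names are illustrative, not the actual `DataStreamer` API):

```python
import threading

dataset_threads = {}

def ensure_streaming(dataset_id: str, target) -> bool:
    """Start a dataset thread unless a live one already exists (fixed logic)."""
    existing = dataset_threads.get(dataset_id)
    if existing is not None:
        if existing.is_alive():
            return True                      # already running and alive
        del dataset_threads[dataset_id]      # clean up the dead thread
    t = threading.Thread(target=target, daemon=True)
    dataset_threads[dataset_id] = t
    t.start()
    return True

# A thread that exits immediately simulates a loop killed by disconnection
ensure_streaming("DAR", lambda: None)
dataset_threads["DAR"].join()
assert not dataset_threads["DAR"].is_alive()

# Before the fix this call would have returned True without restarting;
# now the dead entry is replaced by a fresh, live thread
stop = threading.Event()
ensure_streaming("DAR", stop.wait)
assert dataset_threads["DAR"].is_alive()
stop.set()
```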
### Key Changes
- **Liveness check**: Now verifies `thread.is_alive()` before returning
- **Automatic cleanup**: Dead threads are removed from the dictionary automatically
- **Improved logging**: Dead-thread cleanup is logged for debugging
### Impact
This fix fully resolves the scenario where:
- Reconnection worked ✅
- Tracking worked ✅
- Auto-resume executed ✅
- But **data did not resume**
**Now**: The system is fully functional, with automatic reconnection AND data auto-resume working end to end.

View File

@ -5291,8 +5291,631 @@
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T17:33:20.629769",
"level": "info",
"event_type": "application_started",
"message": "Application initialization completed successfully",
"details": {}
},
{
"timestamp": "2025-08-04T17:33:20.657761",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T17:33:20.665660",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T17:35:27.410648",
"level": "info",
"event_type": "csv_recording_stopped",
"message": "CSV recording stopped (dataset threads continue for UDP streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T17:35:27.417067",
"level": "info",
"event_type": "udp_streaming_stopped",
"message": "UDP streaming to PlotJuggler stopped (CSV recording continues)",
"details": {}
},
{
"timestamp": "2025-08-04T17:35:27.424853",
"level": "info",
"event_type": "dataset_deactivated",
"message": "Dataset deactivated: DAR",
"details": {
"dataset_id": "dar"
}
},
{
"timestamp": "2025-08-04T17:35:27.431787",
"level": "info",
"event_type": "plc_disconnection",
"message": "Disconnected from PLC 10.1.33.11 (stopped recording and streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T17:35:35.807507",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T17:35:35.815721",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T17:35:35.823158",
"level": "info",
"event_type": "plc_connection",
"message": "Successfully connected to PLC 10.1.33.11 and auto-started CSV recording for 1 datasets",
"details": {
"ip": "10.1.33.11",
"rack": 0,
"slot": 2,
"auto_started_recording": true,
"recording_datasets": 1,
"dataset_names": [
"DAR"
]
}
},
{
"timestamp": "2025-08-04T17:35:50.189381",
"level": "info",
"event_type": "application_started",
"message": "Application initialization completed successfully",
"details": {}
},
{
"timestamp": "2025-08-04T17:35:50.219390",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T17:35:50.227939",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T17:58:23.951096",
"level": "info",
"event_type": "application_started",
"message": "Application initialization completed successfully",
"details": {}
},
{
"timestamp": "2025-08-04T17:58:23.982922",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T17:58:23.990306",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T17:58:40.865069",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 0 datasets after PLC reconnection",
"details": {
"resumed_datasets": 0,
"total_attempted": 0
}
},
{
"timestamp": "2025-08-04T17:59:35.445558",
"level": "info",
"event_type": "csv_recording_stopped",
"message": "CSV recording stopped (dataset threads continue for UDP streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T17:59:35.452221",
"level": "info",
"event_type": "udp_streaming_stopped",
"message": "UDP streaming to PlotJuggler stopped (CSV recording continues)",
"details": {}
},
{
"timestamp": "2025-08-04T17:59:35.461395",
"level": "info",
"event_type": "dataset_deactivated",
"message": "Dataset deactivated: DAR",
"details": {
"dataset_id": "dar"
}
},
{
"timestamp": "2025-08-04T17:59:35.468817",
"level": "info",
"event_type": "plc_disconnection",
"message": "Disconnected from PLC 10.1.33.11 (stopped recording and streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T17:59:36.768298",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T17:59:36.775682",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T17:59:36.783499",
"level": "info",
"event_type": "plc_connection",
"message": "Successfully connected to PLC 10.1.33.11 and auto-started CSV recording for 1 datasets",
"details": {
"ip": "10.1.33.11",
"rack": 0,
"slot": 2,
"auto_started_recording": true,
"recording_datasets": 1,
"dataset_names": [
"DAR"
]
}
},
{
"timestamp": "2025-08-04T18:02:54.952571",
"level": "info",
"event_type": "application_started",
"message": "Application initialization completed successfully",
"details": {}
},
{
"timestamp": "2025-08-04T18:02:54.980909",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T18:02:54.987745",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T18:04:36.807205",
"level": "info",
"event_type": "csv_recording_stopped",
"message": "CSV recording stopped (dataset threads continue for UDP streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T18:04:36.817454",
"level": "info",
"event_type": "udp_streaming_stopped",
"message": "UDP streaming to PlotJuggler stopped (CSV recording continues)",
"details": {}
},
{
"timestamp": "2025-08-04T18:04:36.826272",
"level": "info",
"event_type": "dataset_deactivated",
"message": "Dataset deactivated: DAR",
"details": {
"dataset_id": "dar"
}
},
{
"timestamp": "2025-08-04T18:04:36.835754",
"level": "info",
"event_type": "plc_disconnection",
"message": "Disconnected from PLC 10.1.33.11 (stopped recording and streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T18:04:38.137863",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T18:04:38.144678",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T18:04:38.153439",
"level": "info",
"event_type": "plc_connection",
"message": "Successfully connected to PLC 10.1.33.11 and auto-started CSV recording for 1 datasets",
"details": {
"ip": "10.1.33.11",
"rack": 0,
"slot": 2,
"auto_started_recording": true,
"recording_datasets": 1,
"dataset_names": [
"DAR"
]
}
},
{
"timestamp": "2025-08-04T18:05:43.238109",
"level": "info",
"event_type": "application_started",
"message": "Application initialization completed successfully",
"details": {}
},
{
"timestamp": "2025-08-04T18:05:43.268832",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T18:05:43.279905",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T18:06:05.365991",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 1 datasets after PLC reconnection",
"details": {
"resumed_datasets": 1,
"total_attempted": 1
}
},
{
"timestamp": "2025-08-04T18:07:17.191446",
"level": "info",
"event_type": "csv_recording_stopped",
"message": "CSV recording stopped (dataset threads continue for UDP streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T18:07:17.201099",
"level": "info",
"event_type": "udp_streaming_stopped",
"message": "UDP streaming to PlotJuggler stopped (CSV recording continues)",
"details": {}
},
{
"timestamp": "2025-08-04T18:07:17.209470",
"level": "info",
"event_type": "dataset_deactivated",
"message": "Dataset deactivated: DAR",
"details": {
"dataset_id": "dar"
}
},
{
"timestamp": "2025-08-04T18:07:17.218767",
"level": "info",
"event_type": "plc_disconnection",
"message": "Disconnected from PLC 10.1.33.11 (stopped recording and streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T18:07:18.003819",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T18:07:18.015625",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T18:07:18.024429",
"level": "info",
"event_type": "plc_connection",
"message": "Successfully connected to PLC 10.1.33.11 and auto-started CSV recording for 1 datasets",
"details": {
"ip": "10.1.33.11",
"rack": 0,
"slot": 2,
"auto_started_recording": true,
"recording_datasets": 1,
"dataset_names": [
"DAR"
]
}
},
{
"timestamp": "2025-08-04T18:07:18.908974",
"level": "info",
"event_type": "csv_recording_stopped",
"message": "CSV recording stopped (dataset threads continue for UDP streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T18:07:18.916948",
"level": "info",
"event_type": "udp_streaming_stopped",
"message": "UDP streaming to PlotJuggler stopped (CSV recording continues)",
"details": {}
},
{
"timestamp": "2025-08-04T18:07:19.010537",
"level": "info",
"event_type": "dataset_deactivated",
"message": "Dataset deactivated: DAR",
"details": {
"dataset_id": "dar"
}
},
{
"timestamp": "2025-08-04T18:07:19.018557",
"level": "info",
"event_type": "plc_disconnection",
"message": "Disconnected from PLC 10.1.33.11 (stopped recording and streaming)",
"details": {}
},
{
"timestamp": "2025-08-04T18:07:20.149707",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T18:07:20.156522",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T18:07:20.166517",
"level": "info",
"event_type": "plc_connection",
"message": "Successfully connected to PLC 10.1.33.11 and auto-started CSV recording for 1 datasets",
"details": {
"ip": "10.1.33.11",
"rack": 0,
"slot": 2,
"auto_started_recording": true,
"recording_datasets": 1,
"dataset_names": [
"DAR"
]
}
},
{
"timestamp": "2025-08-04T18:13:09.258719",
"level": "info",
"event_type": "application_started",
"message": "Application initialization completed successfully",
"details": {}
},
{
"timestamp": "2025-08-04T18:13:09.288453",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T18:13:09.295297",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T18:22:22.719141",
"level": "info",
"event_type": "application_started",
"message": "Application initialization completed successfully",
"details": {}
},
{
"timestamp": "2025-08-04T18:22:22.746796",
"level": "info",
"event_type": "dataset_activated",
"message": "Dataset activated: DAR",
"details": {
"dataset_id": "dar",
"variables_count": 6,
"streaming_count": 4,
"prefix": "dar"
}
},
{
"timestamp": "2025-08-04T18:22:22.757666",
"level": "info",
"event_type": "csv_recording_started",
"message": "CSV recording started: 1 datasets activated",
"details": {
"activated_datasets": 1,
"total_datasets": 2
}
},
{
"timestamp": "2025-08-04T18:22:41.368297",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 1 datasets after PLC reconnection",
"details": {
"resumed_datasets": 1,
"total_attempted": 1
}
},
{
"timestamp": "2025-08-04T18:23:03.863916",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 1 datasets after PLC reconnection",
"details": {
"resumed_datasets": 1,
"total_attempted": 1
}
},
{
"timestamp": "2025-08-04T18:23:19.365853",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 1 datasets after PLC reconnection",
"details": {
"resumed_datasets": 1,
"total_attempted": 1
}
},
{
"timestamp": "2025-08-04T18:23:44.366505",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 1 datasets after PLC reconnection",
"details": {
"resumed_datasets": 1,
"total_attempted": 1
}
},
{
"timestamp": "2025-08-04T18:24:09.864835",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 1 datasets after PLC reconnection",
"details": {
"resumed_datasets": 1,
"total_attempted": 1
}
},
{
"timestamp": "2025-08-04T18:25:54.868463",
"level": "info",
"event_type": "datasets_resumed_after_reconnection",
"message": "Automatically resumed streaming for 1 datasets after PLC reconnection",
"details": {
"resumed_datasets": 1,
"total_attempted": 1
}
}
],
"last_updated": "2025-08-04T17:21:54.015891",
"total_entries": 506
"last_updated": "2025-08-04T18:25:54.868463",
"total_entries": 570
}

View File

@ -1,7 +1,8 @@
import snap7
import snap7.util
import struct
import logging
import time
import threading
from typing import Dict, Any, Optional
@ -14,6 +15,25 @@ class PLCClient:
self.plc = None
self.connected = False
# Connection configuration for reconnection
self.connection_config = {"ip": None, "rack": None, "slot": None}
# Reconnection state management
self.reconnection_enabled = True
self.is_reconnecting = False
self.reconnection_thread = None
self.reconnection_lock = threading.Lock()
self.last_connection_attempt = 0
self.consecutive_failures = 0
self.max_backoff_seconds = 60 # Maximum 1 minute backoff
self.base_backoff_seconds = 1 # Start with 1 second
# Callback system for reconnection events
self.reconnection_callbacks = [] # List of callback functions
self.disconnection_callbacks = (
[]
) # List of callback functions for disconnection tracking
def connect(self, ip: str, rack: int, slot: int) -> bool:
"""Connect to S7-315 PLC"""
try:
@ -24,8 +44,16 @@ class PLCClient:
self.plc.connect(ip, rack, slot)
self.connected = True
# Store connection configuration for reconnection
self.connection_config = {"ip": ip, "rack": rack, "slot": slot}
# Reset reconnection state on successful connection
self.consecutive_failures = 0
self.is_reconnecting = False
if self.logger:
self.logger.info(f"Successfully connected to PLC {ip}:{rack}/{slot}")
msg = f"Successfully connected to PLC {ip}:{rack}/{slot}"
self.logger.info(msg)
return True
except Exception as e:
@ -39,6 +67,15 @@ class PLCClient:
def disconnect(self):
"""Disconnect from PLC"""
try:
# Stop any ongoing reconnection attempts
self.reconnection_enabled = False
with self.reconnection_lock:
self.is_reconnecting = False
thread = self.reconnection_thread
if thread and thread.is_alive():
thread.join(timeout=2)
if self.plc:
self.plc.disconnect()
self.connected = False
@ -52,6 +89,136 @@ class PLCClient:
"""Check if PLC is connected"""
return self.connected and self.plc is not None
def _is_connection_error(self, error_str: str) -> bool:
"""Check if error is related to connection problems"""
connection_error_indicators = [
"connection timed out",
"connection abort",
"recv tcp",
"send tcp",
"socket",
"timeout",
"connection refused",
"connection reset",
"broken pipe",
"network is unreachable",
]
error_lower = str(error_str).lower()
return any(
indicator in error_lower for indicator in connection_error_indicators
)
def _calculate_backoff_delay(self) -> float:
"""Calculate exponential backoff delay"""
if self.consecutive_failures == 0:
return 0
# Exponential backoff: 1s, 2s, 4s, 8s, 16s, 32s, 60s (max)
power = self.consecutive_failures - 1
delay = self.base_backoff_seconds * (2**power)
return min(delay, self.max_backoff_seconds)
def _attempt_reconnection(self) -> bool:
"""Attempt to reconnect to PLC"""
if not self.connection_config["ip"]:
return False
try:
if self.logger:
ip = self.connection_config["ip"]
self.logger.info(f"Attempting reconnection to PLC {ip}...")
# Force disconnect first
if self.plc:
try:
self.plc.disconnect()
except Exception:
pass # Ignore errors during forced disconnect
# Create new client and attempt connection
self.plc = snap7.client.Client()
self.plc.connect(
self.connection_config["ip"],
self.connection_config["rack"],
self.connection_config["slot"],
)
self.connected = True
self.consecutive_failures = 0
if self.logger:
ip = self.connection_config["ip"]
self.logger.info(f"Successfully reconnected to PLC {ip}")
# Notify callbacks about successful reconnection
self._notify_reconnection_success()
return True
except Exception as e:
self.connected = False
self.consecutive_failures += 1
if self.logger:
self.logger.error(f"Reconnection attempt failed: {str(e)}")
return False
def _start_automatic_reconnection(self):
"""Start automatic reconnection in background thread"""
if not self.reconnection_enabled:
return
with self.reconnection_lock:
if self.is_reconnecting:
return # Already reconnecting
self.is_reconnecting = True
def reconnection_worker():
while self.reconnection_enabled and not self.connected:
try:
# Calculate delay based on consecutive failures
delay = self._calculate_backoff_delay()
if delay > 0:
if self.logger:
attempt_num = self.consecutive_failures + 1
msg = f"Waiting {delay}s before reconnection "
msg += f"attempt (attempt #{attempt_num})"
self.logger.info(msg)
time.sleep(delay)
# Check if we should still try to reconnect
if not self.reconnection_enabled:
break
# Attempt reconnection
if self._attempt_reconnection():
# Success - exit loop
break
# If we've reached max backoff, continue with max delay
if delay >= self.max_backoff_seconds and self.logger:
max_sec = self.max_backoff_seconds
msg = f"Max backoff reached, will retry every {max_sec}s"
self.logger.info(msg)
except Exception as e:
if self.logger:
self.logger.error(f"Error in reconnection worker: {e}")
time.sleep(1) # Wait a bit before continuing
# Reset reconnection state
with self.reconnection_lock:
self.is_reconnecting = False
# Start reconnection thread
self.reconnection_thread = threading.Thread(
target=reconnection_worker, daemon=True
)
self.reconnection_thread.start()
def read_variable(self, var_config: Dict[str, Any]) -> Any:
"""Read a specific variable from the PLC"""
if not self.is_connected():
@ -85,6 +252,32 @@ class PLCClient:
except Exception as e:
if self.logger:
self.logger.error(f"Error reading variable: {e}")
# Check if this is a connection error and start automatic reconnection
if self._is_connection_error(str(e)):
was_connected_before = self.connected
self.connected = False
self.consecutive_failures += 1
if self.logger:
failure_num = self.consecutive_failures
msg = (
"Connection error detected, starting automatic "
f"reconnection (failure #{failure_num})"
)
self.logger.warning(msg)
# If we were connected before, notify disconnection callbacks FIRST
if was_connected_before:
if self.logger:
self.logger.info(
"Notifying disconnection callbacks for dataset tracking"
)
self._notify_disconnection_detected()
# Start automatic reconnection in background
self._start_automatic_reconnection()
return None
def _read_db_variable(
@ -308,15 +501,15 @@ class PLCClient:
success_count += 1
else:
data[var_name] = None
errors[var_name] = (
"Read returned None - possible configuration error"
)
msg_error = "Read returned None - possible config error"
errors[var_name] = msg_error
except ConnectionError as e:
data[var_name] = None
errors[var_name] = f"Connection error: {str(e)}"
if self.logger:
self.logger.error(f"Connection error reading {var_name}: {e}")
msg = f"Connection error reading {var_name}: {e}"
self.logger.error(msg)
except TimeoutError as e:
data[var_name] = None
@ -328,13 +521,15 @@ class PLCClient:
data[var_name] = None
errors[var_name] = f"Configuration error: {str(e)}"
if self.logger:
self.logger.error(f"Configuration error for {var_name}: {e}")
msg = f"Configuration error for {var_name}: {e}"
self.logger.error(msg)
except Exception as e:
data[var_name] = None
errors[var_name] = f"Unexpected error: {str(e)}"
if self.logger:
self.logger.error(f"Unexpected error reading {var_name}: {e}")
msg = f"Unexpected error reading {var_name}: {e}"
self.logger.error(msg)
# Determine overall success
if success_count == 0:
@ -351,9 +546,12 @@ class PLCClient:
},
}
elif success_count < total_count:
warning_msg = (
f"Partial success: {success_count}/{total_count} " "variables read"
)
return {
"success": True,
"warning": f"Partial success: {success_count}/{total_count} variables read",
"warning": warning_msg,
"values": data,
"errors": errors,
"stats": {
@ -376,7 +574,79 @@ class PLCClient:
def get_connection_info(self) -> Dict[str, Any]:
"""Get current connection information"""
return {"connected": self.connected, "client_available": self.plc is not None}
has_ip = self.connection_config["ip"]
config_copy = self.connection_config.copy() if has_ip else None
delay = self._calculate_backoff_delay() if not self.connected else 0
return {
"connected": self.connected,
"client_available": self.plc is not None,
"reconnection_enabled": self.reconnection_enabled,
"is_reconnecting": self.is_reconnecting,
"consecutive_failures": self.consecutive_failures,
"connection_config": config_copy,
"next_reconnection_delay": delay,
}
def enable_automatic_reconnection(self, enabled: bool = True):
"""Enable or disable automatic reconnection"""
self.reconnection_enabled = enabled
if self.logger:
status = "enabled" if enabled else "disabled"
self.logger.info(f"Automatic reconnection {status}")
def is_reconnection_active(self) -> bool:
"""Check if reconnection is currently active"""
return self.is_reconnecting
def get_reconnection_status(self) -> Dict[str, Any]:
"""Get detailed reconnection status"""
next_delay = self._calculate_backoff_delay() if not self.connected else 0
return {
"enabled": self.reconnection_enabled,
"active": self.is_reconnecting,
"consecutive_failures": self.consecutive_failures,
"next_delay_seconds": next_delay,
"max_backoff_seconds": self.max_backoff_seconds,
"base_backoff_seconds": self.base_backoff_seconds,
}
def add_reconnection_callback(self, callback_func):
"""Add a callback function to be called when reconnection is successful"""
if callback_func not in self.reconnection_callbacks:
self.reconnection_callbacks.append(callback_func)
def remove_reconnection_callback(self, callback_func):
"""Remove a reconnection callback function"""
if callback_func in self.reconnection_callbacks:
self.reconnection_callbacks.remove(callback_func)
def add_disconnection_callback(self, callback_func):
"""Add a callback function to be called when disconnection is detected"""
if callback_func not in self.disconnection_callbacks:
self.disconnection_callbacks.append(callback_func)
def remove_disconnection_callback(self, callback_func):
"""Remove a disconnection callback function"""
if callback_func in self.disconnection_callbacks:
self.disconnection_callbacks.remove(callback_func)
def _notify_reconnection_success(self):
"""Notify all registered callbacks about successful reconnection"""
for callback in self.reconnection_callbacks:
try:
callback()
except Exception as e:
if self.logger:
self.logger.error(f"Error in reconnection callback: {e}")
def _notify_disconnection_detected(self):
"""Notify all registered callbacks about connection loss detection"""
for callback in self.disconnection_callbacks:
try:
callback()
except Exception as e:
if self.logger:
self.logger.error(f"Error in disconnection callback: {e}")
def test_connection(self, ip: str, rack: int, slot: int) -> bool:
"""Test connection to PLC without permanently connecting"""

View File

@ -172,6 +172,9 @@ class PLCDataStreamer:
if self.logger:
self.logger.warning(f"Error deactivating dataset {dataset_id}: {e}")
# Clear any tracked datasets for auto-resume since this is manual disconnect
self.data_streamer.datasets_before_disconnection.clear()
# Disconnect from PLC
self.plc_client.disconnect()
@ -360,6 +363,9 @@ class PLCDataStreamer:
),
"sampling_interval": self.config_manager.sampling_interval,
"disk_space_info": self.get_disk_space_info(),
# Add reconnection status information
"plc_reconnection": self.plc_client.get_reconnection_status(),
"plc_connection_info": self.plc_client.get_connection_info(),
}
return status

View File

@ -66,6 +66,17 @@ class DataStreamer:
# 📈 PLOT MANAGER - Real-time plotting system
self.plot_manager = PlotManager(event_logger, logger)
# 🔄 RECONNECTION MANAGEMENT - Track datasets for auto-resume
self.datasets_before_disconnection = (
set()
) # Track active datasets before disconnection
# Register for reconnection callbacks
self.plc_client.add_reconnection_callback(self._on_plc_reconnected)
# Register for disconnection callbacks
self.plc_client.add_disconnection_callback(self._on_plc_disconnected)
def setup_udp_socket(self) -> bool:
"""Setup UDP socket for PlotJuggler communication"""
try:
@ -587,8 +598,24 @@ class DataStreamer:
if dataset_id not in self.config_manager.datasets:
return False
# Check if there's already a running thread
if dataset_id in self.dataset_threads:
return True # Already running
existing_thread = self.dataset_threads[dataset_id]
if existing_thread.is_alive():
if self.logger:
dataset_name = self.config_manager.datasets[dataset_id]["name"]
self.logger.info(
f"Dataset '{dataset_name}' thread is already running, skipping creation"
)
return True # Already running and alive
else:
# Clean up dead thread
if self.logger:
dataset_name = self.config_manager.datasets[dataset_id]["name"]
self.logger.info(
f"Cleaning up dead thread for dataset '{dataset_name}'"
)
del self.dataset_threads[dataset_id]
# Create and start thread for this dataset
thread = threading.Thread(
@ -816,7 +843,7 @@ class DataStreamer:
self.event_logger.log_event(
"info",
"udp_streaming_started",
f"UDP streaming to PlotJuggler started",
"UDP streaming to PlotJuggler started",
{
"udp_host": self.config_manager.udp_config["host"],
"udp_port": self.config_manager.udp_config["port"],
@ -859,6 +886,11 @@ class DataStreamer:
"""Check if UDP streaming is active"""
return self.udp_streaming_enabled
@property
def streaming(self) -> bool:
"""Get streaming status (backward compatibility)"""
return self.udp_streaming_enabled
def is_csv_recording(self) -> bool:
"""Check if CSV recording is active"""
return self.csv_recording_enabled and bool(self.dataset_csv_files)
@ -988,3 +1020,182 @@ class DataStreamer:
"open_csv_files": len(self.dataset_csv_files),
"udp_socket_active": self.udp_socket is not None,
}
def _on_plc_reconnected(self):
"""Callback method called when PLC reconnects successfully"""
if self.logger:
self.logger.info("PLC reconnection detected, resuming dataset streaming...")
# Resume streaming for datasets that were active before disconnection
datasets_to_resume = self.datasets_before_disconnection.copy()
resumed_count = 0
if self.logger:
self.logger.info(f"Datasets to resume: {list(datasets_to_resume)}")
self.logger.info(
f"Current active datasets: {list(self.config_manager.active_datasets)}"
)
self.logger.info(
f"Current dataset threads: {list(self.dataset_threads.keys())}"
)
if not datasets_to_resume:
if self.logger:
self.logger.warning(
"No datasets were tracked for auto-resume. This might indicate a tracking issue."
)
return
for dataset_id in datasets_to_resume:
try:
dataset_name = self.config_manager.datasets.get(dataset_id, {}).get(
"name", dataset_id
)
if self.logger:
self.logger.info(
f"Attempting to resume dataset '{dataset_name}' (ID: {dataset_id})"
)
# Check if dataset still exists and is supposed to be active
dataset_exists = dataset_id in self.config_manager.datasets
dataset_active = dataset_id in self.config_manager.active_datasets
if self.logger:
self.logger.info(
f"Dataset '{dataset_name}': exists={dataset_exists}, active={dataset_active}"
)
if dataset_exists and dataset_active:
# Check if thread is not already running
thread_exists = dataset_id in self.dataset_threads
thread_alive = (
thread_exists and self.dataset_threads[dataset_id].is_alive()
)
if self.logger:
self.logger.info(
f"Dataset '{dataset_name}': thread_exists={thread_exists}, thread_alive={thread_alive}"
)
if not thread_exists or not thread_alive:
if self.logger:
self.logger.info(
f"Starting streaming thread for dataset '{dataset_name}'"
)
self.start_dataset_streaming(dataset_id)
resumed_count += 1
if self.logger:
self.logger.info(
f"Successfully resumed streaming for dataset '{dataset_name}'"
)
else:
if self.logger:
self.logger.info(
f"Dataset '{dataset_name}' thread is already running, skipping"
)
else:
if self.logger:
self.logger.warning(
f"Dataset '{dataset_name}' no longer exists or is not active, skipping resume"
)
except Exception as e:
if self.logger:
self.logger.error(f"Error resuming dataset {dataset_id}: {e}")
# Clear the tracking set since we've attempted to resume all
self.datasets_before_disconnection.clear()
if self.logger:
self.logger.info(
f"Resumed streaming for {resumed_count} datasets after reconnection"
)
# Log event for monitoring
self.event_logger.log_event(
"info",
"datasets_resumed_after_reconnection",
f"Automatically resumed streaming for {resumed_count} datasets after PLC reconnection",
{
"resumed_datasets": resumed_count,
"total_attempted": len(datasets_to_resume),
},
)
def _on_plc_disconnected(self):
"""Callback method called when PLC disconnection is detected"""
if self.logger:
self.logger.info(
"PLC disconnection detected, tracking active datasets for auto-resume"
)
self._track_disconnection()
def _track_disconnection(self):
"""Track which datasets were active before disconnection for auto-resume"""
# Only track if we haven't already tracked (avoid duplicate tracking)
if self.datasets_before_disconnection:
if self.logger:
self.logger.info(
"Datasets already tracked, skipping duplicate tracking"
)
return
# Store currently active datasets that have running threads
tracked_datasets = set()
if self.logger:
self.logger.info(
f"Active datasets in config: {list(self.config_manager.active_datasets)}"
)
self.logger.info(
f"Current dataset threads: {list(self.dataset_threads.keys())}"
)
for dataset_id in self.config_manager.active_datasets:
if dataset_id in self.dataset_threads:
thread = self.dataset_threads[dataset_id]
thread_alive = thread.is_alive()
if self.logger:
dataset_name = self.config_manager.datasets.get(dataset_id, {}).get(
"name", dataset_id
)
self.logger.info(
f"Dataset '{dataset_name}' (ID: {dataset_id}): thread_alive={thread_alive}"
)
if thread_alive:
tracked_datasets.add(dataset_id)
if self.logger:
dataset_name = self.config_manager.datasets.get(
dataset_id, {}
).get("name", dataset_id)
self.logger.info(
f"Tracked dataset '{dataset_name}' for auto-resume"
)
else:
if self.logger:
dataset_name = self.config_manager.datasets.get(dataset_id, {}).get(
"name", dataset_id
)
self.logger.info(
f"Dataset '{dataset_name}' has no thread - not tracked"
)
self.datasets_before_disconnection = tracked_datasets
if self.logger:
count = len(self.datasets_before_disconnection)
if count > 0:
dataset_names = [
self.config_manager.datasets.get(ds_id, {}).get("name", ds_id)
for ds_id in self.datasets_before_disconnection
]
self.logger.info(
f"Tracked {count} active datasets for auto-resume: {dataset_names}"
)
else:
self.logger.warning("No active datasets found to track for auto-resume")
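The coupling between `PLCClient` and `DataStreamer` above is a plain observer pattern: the streamer registers callbacks, and the client notifies them on reconnect/disconnect while isolating subscriber failures. A minimal sketch of that registry (class and method bodies simplified from the commit; the real code logs callback errors instead of swallowing them):

```python
class ReconnectableClient:
    """Simplified callback registry mirroring PLCClient's wiring."""

    def __init__(self):
        self.reconnection_callbacks = []

    def add_reconnection_callback(self, callback_func):
        # Avoid registering the same subscriber twice
        if callback_func not in self.reconnection_callbacks:
            self.reconnection_callbacks.append(callback_func)

    def _notify_reconnection_success(self):
        # One faulty subscriber must not prevent the others from running
        for callback in self.reconnection_callbacks:
            try:
                callback()
            except Exception:
                pass  # the real implementation logs the exception here

events = []
client = ReconnectableClient()
client.add_reconnection_callback(lambda: events.append("resumed"))
client.add_reconnection_callback(lambda: 1 / 0)  # faulty subscriber, isolated
client.add_reconnection_callback(lambda: events.append("logged"))
client._notify_reconnection_success()
```

This is why `_on_plc_reconnected` can safely do real work (restarting dataset threads): an exception in one callback never aborts the notification loop.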

main.py
View File

@ -1517,6 +1517,56 @@ def get_status():
return jsonify(streamer.get_status())
@app.route("/api/plc/reconnection/status")
def get_plc_reconnection_status():
"""Get detailed PLC reconnection status"""
error_response = check_streamer_initialized()
if error_response:
return error_response
try:
status = streamer.plc_client.get_reconnection_status()
connection_info = streamer.plc_client.get_connection_info()
return jsonify(
{"success": True, "reconnection": status, "connection": connection_info}
)
except Exception as e:
return jsonify({"success": False, "error": str(e)}), 500
@app.route("/api/plc/reconnection/enable", methods=["POST"])
def enable_plc_reconnection():
"""Enable automatic PLC reconnection"""
error_response = check_streamer_initialized()
if error_response:
return error_response
try:
streamer.plc_client.enable_automatic_reconnection(True)
return jsonify(
{"success": True, "message": "Automatic PLC reconnection enabled"}
)
except Exception as e:
return jsonify({"success": False, "error": str(e)}), 500
@app.route("/api/plc/reconnection/disable", methods=["POST"])
def disable_plc_reconnection():
"""Disable automatic PLC reconnection"""
error_response = check_streamer_initialized()
if error_response:
return error_response
try:
streamer.plc_client.enable_automatic_reconnection(False)
return jsonify(
{"success": True, "message": "Automatic PLC reconnection disabled"}
)
except Exception as e:
return jsonify({"success": False, "error": str(e)}), 500
@app.route("/api/events")
def get_events():
"""Get recent events from the application log"""

View File

@ -70,5 +70,5 @@
],
"current_dataset_id": "dar",
"version": "1.0",
"last_update": "2025-08-04T17:21:54.008091"
"last_update": "2025-08-04T18:22:22.744860"
}

View File

@ -45,6 +45,26 @@
color: var(--pico-secondary-inverse);
}
.status-reconnecting {
background: #ff9800;
color: white;
animation: pulse 2s infinite;
}
@keyframes pulse {
0% {
opacity: 1;
}
50% {
opacity: 0.7;
}
100% {
opacity: 1;
}
}
.status-streaming {
background: var(--pico-primary-background);
color: var(--pico-primary-inverse);

View File

@ -16,9 +16,16 @@ function updateStatus() {
const csvStatus = document.getElementById('csv-status');
const diskSpaceStatus = document.getElementById('disk-space');
// Update PLC connection status
// Update PLC connection status with reconnection info
if (data.plc_connected) {
plcStatus.innerHTML = '🔌 PLC: Connected <div style="margin-top: 8px;"><button type="button" id="status-disconnect-btn">❌ Disconnect</button></div>';
const reconnectionInfo = data.plc_reconnection || {};
let reconnectionStatus = '';
if (reconnectionInfo.enabled) {
reconnectionStatus = '<div style="font-size: 0.8em; color: #666;">🔄 Auto-reconnection: enabled</div>';
}
plcStatus.innerHTML = `🔌 PLC: Connected ${reconnectionStatus}<div style="margin-top: 8px;"><button type="button" id="status-disconnect-btn">❌ Disconnect</button></div>`;
plcStatus.className = 'status-item status-connected';
// Attach event listener to the new disconnect button
@ -31,8 +38,38 @@ function updateStatus() {
});
});
} else {
plcStatus.innerHTML = '🔌 PLC: Disconnected <div style="margin-top: 8px;"><button type="button" id="status-connect-btn">🔗 Connect</button></div>';
plcStatus.className = 'status-item status-disconnected';
const reconnectionInfo = data.plc_reconnection || {};
const connectionInfo = data.plc_connection_info || {};
let statusText = '🔌 PLC: Disconnected';
let statusClass = 'status-item status-disconnected';
let reconnectionDetails = '';
// Show reconnection info when it is enabled
if (reconnectionInfo.enabled) {
if (reconnectionInfo.active) {
statusText = '🔌 PLC: Reconnecting...';
statusClass = 'status-item status-reconnecting';
const nextDelay = reconnectionInfo.next_delay_seconds || 0;
const failures = reconnectionInfo.consecutive_failures || 0;
if (nextDelay > 0) {
reconnectionDetails = `<div style="font-size: 0.8em; color: #ff9800;">🔄 Next attempt in ${nextDelay}s (failure #${failures})</div>`;
} else {
reconnectionDetails = '<div style="font-size: 0.8em; color: #ff9800;">🔄 Attempting reconnection...</div>';
}
} else {
if (reconnectionInfo.consecutive_failures > 0) {
reconnectionDetails = `<div style="font-size: 0.8em; color: #666;">🔄 Auto-reconnection: enabled (${reconnectionInfo.consecutive_failures} failures)</div>`;
} else {
reconnectionDetails = '<div style="font-size: 0.8em; color: #666;">🔄 Auto-reconnection: enabled</div>';
}
}
}
plcStatus.innerHTML = `${statusText} ${reconnectionDetails}<div style="margin-top: 8px;"><button type="button" id="status-connect-btn">🔗 Connect</button></div>`;
plcStatus.className = statusClass;
// Attach event listener to the connect button
document.getElementById('status-connect-btn').addEventListener('click', function () {

View File

@ -7,5 +7,5 @@
]
},
"auto_recovery_enabled": true,
"last_update": "2025-08-04T17:21:54.020081"
"last_update": "2025-08-04T18:22:22.763087"
}