ParamManagerScripts/backend/script_groups/XML Parser to SCL/xref_info.md

193 lines
11 KiB
Markdown

## Technical Documentation: Parsing TIA Portal `_XRef.xml` Files for Call Tree Generation
**Version:** 1.0 **Date:** 2025-05-05
### 1. Introduction
This document describes the structure and interpretation of the XML files (`*_XRef.xml`) generated by the TIA Portal Openness `export_cross_references` function (available via libraries like `siemens_tia_scripting`). The primary goal is to enable software developers to programmatically parse these files to extract block call relationships and build a comprehensive call tree for a PLC program.
The `_XRef.xml` file contains detailed information about all objects referenced _by_ a specific source object (e.g., an OB, FB, or FC). By processing these files for all relevant blocks, a complete picture of the program's call structure can be assembled.
### 2. File Format Overview
The `_XRef.xml` file is a standard XML document. Its high-level structure typically looks like this:
XML
```xml
<?xml version="1.0" encoding="utf-8"?>
<CrossReferences xmlns:i="..." xmlns="...">
<Sources>
<SourceObject> <Name>...</Name>
<Address>...</Address>
<Device>...</Device>
<Path>...</Path>
<TypeName>...</TypeName>
<UnderlyingObject>...</UnderlyingObject>
<Children />
<References> <ReferenceObject> <Name>...</Name>
<Address>...</Address>
<Device>...</Device>
<Path>...</Path>
<TypeName>...</TypeName> <UnderlyingObject>...</UnderlyingObject>
<Locations> <Location>
<Access>...</Access> <Address>...</Address>
<Name>...</Name>
<ReferenceLocation>...</ReferenceLocation> <ReferenceType>Uses</ReferenceType>
</Location>
</Locations>
</ReferenceObject>
</References>
</SourceObject>
</Sources>
</CrossReferences>
```
### 3. Key XML Elements for Call Tree Construction
To build a call tree, you need to identify the _caller_ and the _callee_ for each block call. The following XML elements are essential:
1. **`<SourceObject>`:** Represents the block performing the calls (the **caller**).
- **`<Name>`:** The symbolic name of the caller block (e.g., `_CYCL_EXC`).
- **`<Address>`:** The absolute address (e.g., `%OB1`).
- **`<TypeName>`:** The type of the caller block (e.g., `LAD-Organization block`).
2. **`<ReferenceObject>`:** Represents an object being referenced by the `SourceObject`. This _could_ be the **callee**.
- **`<Name>`:** The symbolic name of the referenced object (e.g., `BlenderCtrl__Main`, `Co2_Counters_DB`).
- **`<Address>`:** The absolute address (e.g., `%FC2000`, `%DB1021`).
- **`<TypeName>`:** The type of the referenced object (e.g., `LAD-Function`, `Instance DB of Co2_Counters [FB1020]`). This is vital for identifying FCs and FBs (via their instance DBs).
3. **`<Location>`:** Specifies exactly how and where the `ReferenceObject` is used within the `SourceObject`.
- **`<Access>`:** **This is the most critical element for call trees.** Look for the value `Call`. This indicates a direct Function Call (FC). An access type of `InstanceDB` indicates the usage of an instance DB, which implies a Function Block (FB) call is occurring.
- **`<ReferenceLocation>`:** Provides human-readable context about where the reference occurs within the caller's code (e.g., `@_CYCL_EXC ▶ NW3 (Blender CTRL)`). Useful for debugging or visualization.
### 4. Data Extraction Strategy for Call Tree
A program parsing these files should follow these steps for each `_XRef.xml` file:
1. **Parse XML:** Load the `_XRef.xml` file using a suitable XML parsing library (e.g., Python's `xml.etree.ElementTree` or `lxml`).
2. **Identify Caller:** Navigate to the `<SourceObject>` element and extract its `<Name>`. This is the caller block for all references within this file.
3. **Iterate References:** Loop through each `<ReferenceObject>` within the `<References>` section of the `<SourceObject>`.
4. **Iterate Locations:** For each `<ReferenceObject>`, loop through its `<Location>` elements.
5. **Filter for Calls:** Check the text content of the `<Access>` tag within each `<Location>`.
- **If `Access` is `Call`:**
- The `<Name>` of the current `<ReferenceObject>` is the **callee** (an FC).
- Record the relationship: `Caller Name` -> `Callee Name (FC)`.
- **If `Access` is `InstanceDB`:**
- This signifies an FB call is happening using this instance DB.
- The `<Name>` of the current `<ReferenceObject>` is the Instance DB name (e.g., `Co2_Counters_DB`).
- To find the actual FB being called, examine the `<TypeName>` of this `ReferenceObject`. It usually contains the FB name/number (e.g., `Instance DB of Co2_Counters [FB1020]`). Extract the FB name (`Co2_Counters`) or number (`FB1020`). This is the **callee**.
- Record the relationship: `Caller Name` -> `Callee Name (FB)`.
6. **Store Relationships:** Store the identified caller-callee pairs in a suitable data structure.
### 5. Building the Call Tree Data Structure
After parsing one or more `_XRef.xml` files, the extracted relationships can be stored. Common approaches include:
- **Dictionary (Adjacency List):** A dictionary where keys are caller names and values are lists of callee names.
Python
```python
call_tree = {
'_CYCL_EXC': ['BlenderCtrl__Main', 'MessageScroll', 'ITC_MainRoutine', 'Co2_Counters', 'ProcedureProdBrixRecovery', 'Key Read & Write', 'GNS_PLCdia_MainRoutine'],
'BlenderCtrl__Main': ['SomeOtherBlock', ...],
# ... other callers
}
```
- **Graph Representation:** Using libraries like `networkx` in Python to create a directed graph where blocks are nodes and calls are edges. This allows for more complex analysis (e.g., finding paths, cycles).
- **Custom Objects:** Define `Block` and `Call` classes for a more object-oriented representation.
### 6. Handling Multiple Files
A single `_XRef.xml` file only details the references _from_ one `SourceObject`. To build a complete call tree for the entire program or PLC:
1. **Export References:** Use the Openness script to call `export_cross_references` for _all_ relevant OBs, FBs, and FCs in the project.
2. **Process All Files:** Run the parsing logic described above on each generated `_XRef.xml` file.
3. **Aggregate Results:** Combine the caller-callee relationships extracted from all files into a single data structure (e.g., merge dictionaries or add nodes/edges to the graph).
### 7. Example (Conceptual Python using `xml.etree.ElementTree`)
Python
```python
import xml.etree.ElementTree as ET
import re # For extracting FB name from TypeName
def parse_xref_for_calls(xml_file_path):
"""Parses a _XRef.xml file and extracts call relationships."""
calls = {} # {caller: [callee1, callee2, ...]}
try:
tree = ET.parse(xml_file_path)
root = tree.getroot()
# Namespace handling might be needed depending on the xmlns
ns = {'ns0': 'TestNamespace1'} # Adjust namespace if different in your file
for source_object in root.findall('.//ns0:SourceObject', ns):
caller_name = source_object.findtext('ns0:Name', default='UnknownCaller', namespaces=ns)
if caller_name not in calls:
calls[caller_name] = []
for ref_object in source_object.findall('.//ns0:ReferenceObject', ns):
ref_name = ref_object.findtext('ns0:Name', default='UnknownRef', namespaces=ns)
ref_type_name = ref_object.findtext('ns0:TypeName', default='', namespaces=ns)
for location in ref_object.findall('.//ns0:Location', ns):
access_type = location.findtext('ns0:Access', default='', namespaces=ns)
if access_type == 'Call':
# Direct FC call
if ref_name not in calls[caller_name]:
calls[caller_name].append(ref_name)
elif access_type == 'InstanceDB':
# FB call via Instance DB
# Extract FB name/number from TypeName (e.g., "Instance DB of BlockName [FB123]")
match = re.search(r'Instance DB of (.*?) \[([A-Za-z]+[0-9]+)\]', ref_type_name)
callee_fb_name = 'UnknownFB'
if match:
# Prefer symbolic name if available, else use number
callee_fb_name = match.group(1) if match.group(1) else match.group(2)
elif 'Instance DB of' in ref_type_name: # Fallback if regex fails
callee_fb_name = ref_type_name.split('Instance DB of ')[-1].strip()
if callee_fb_name not in calls[caller_name]:
calls[caller_name].append(callee_fb_name)
except ET.ParseError as e:
print(f"Error parsing XML file {xml_file_path}: {e}")
except FileNotFoundError:
print(f"Error: File not found {xml_file_path}")
# Clean up entries with no calls
calls = {k: v for k, v in calls.items() if v}
return calls
# --- Aggregation Example ---
# all_calls = {}
# for xref_file in list_of_all_xref_files:
# file_calls = parse_xref_for_calls(xref_file)
# for caller, callees in file_calls.items():
# if caller not in all_calls:
# all_calls[caller] = []
# for callee in callees:
# if callee not in all_calls[caller]:
# all_calls[caller].append(callee)
# print(all_calls)
```
_Note: Namespace handling (`ns=...`) in ElementTree might need adjustment based on the exact default namespace declared in your XML files._
### 8. Considerations
- **Function Block Calls:** Remember that FB calls are identified indirectly via the `InstanceDB` access type and parsing the `<TypeName>` of the `ReferenceObject`.
- **System Blocks (SFC/SFB):** Calls to system functions/blocks should appear similarly to FC/FB calls and can be included in the tree. Their `<TypeName>` might indicate they are system blocks.
- **TIA Portal Versions:** While the basic structure is consistent, minor variations in tags or namespaces might exist between different TIA Portal versions. Always test with exports from your specific version.
- **Data References:** This documentation focuses on the call tree. The XML also contains `Read`, `Write`, `RW` access types, which can be parsed similarly to build a full cross-reference map for tags and data blocks.
### 9. Conclusion
The `_XRef.xml` files provide a detailed, machine-readable description of block references within a TIA Portal project. By parsing the XML structure, focusing on the `<SourceObject>`, `<ReferenceObject>`, and specifically the `<Access>` tag within `<Location>`, developers can reliably extract block call information and construct program call trees for analysis, documentation, or visualization purposes. Remember to aggregate data from multiple files for a complete program overview.