Work Order Guide¶
Foundation: BBC · DATA_MODEL · BIM_COBOL · MANIFESTO · TestArchitecture
Key insight: The classification YAML (`classify_*.yaml`) is a human-readable stand-in for ERP entities. Each YAML section maps to an iDempiere table:

| YAML Section | ERP Entity | What It Defines |
|---|---|---|
| `building` | C_Order | Which product to build (one order = one building) |
| `storeys`/`segments` | C_OrderLine | BOM explosion levels (floors, segments) |
| `floor_rooms` | C_OrderLine + M_AttributeSet | IFC space → BOM template mapping |
| `static_children` | C_OrderLine (static entries) | Fixed components (slabs, roof, MEP trunk) |
| `composition` | Ref_Order_ID (inheritance) | Mirror/repeat pattern |

In the production path, these definitions come from the BIM Designer UI or future iDempiere REST integration. The YAML is the Order input and onboarding tool — how buildings enter the system. IFC extraction is IFC-driven (family types, spatial containment). The YAML defines the Order: how extracted elements are organised into the BOM tree.
Quick Start — For Users¶
Everything is already built. 34 buildings are onboarded, BOMs are generated, the component library is populated. You don't need to run the pipeline to start working.
```bash
mvn compile -q

# Start the BIM Designer server (Blender/Bonsai connects here)
mvn exec:java -pl BonsaiBIMDesigner \
  -Dexec.mainClass="com.bim.designer.api.DesignerServer" \
  -Dexec.args="library 9876" -q

# Or start the Web UI
./scripts/run_webui.sh
```
What you can do immediately:
- Design buildings in Bonsai (BOM Drop → select a building product → compile)
- Browse building data in the Web UI (BOM trees, 4D-7D reports)
- Create C_OrderLines referencing existing products from the library
- Run ./scripts/run_RosettaStones.sh classify_sh.yaml to verify a building compiles
The design workflow (BOM Drop — BIM_Designer_SRS.md §28):
1. Pick a building product (e.g. BUILDING_SH_STD) → creates 1 C_OrderLine
2. Compiler explodes the BOM tree → all elements appear in the viewport
3. Navigate the BOM Outliner to swap/add/remove components
4. Each swap is a new C_OrderLine pointing to a different M_Product
5. Compile again → updated output
See BIM_Designer_UserGuide.md for the full walkthrough.
The Invention Boundary¶
The classification YAML (classify_*.yaml) is the only human-crafted artifact in the BIM compiler pipeline. Everything else is deterministic:
| Layer | Source | Invented? | Code |
|---|---|---|---|
| YAML | Human/AI author | YES — the only point of invention | classify_*.yaml |
| YAML parsing | YAML → config records | No | ClassificationYaml.java |
| Extraction | Reference DB → I_Element_Extraction | No — reads data | ExtractionPopulator.java |
| Product link | M_Product_ID = element_ref | No — deterministic | ExtractionPopulator.java:150 |
| Geometry gap fill | Import missing meshes from ref DB | No — copies blobs | ExtractionPopulator.fillGeometryGaps() |
| Product images | M_Product_ID → geometry_hash | No — join | ProductRegistrar.ensureProductImages() |
| Product registration | M_Product in component_library.db | No — from extraction | ProductRegistrar.ensureProductCatalog() |
| Scope spaces | Element → room assignment | No — IFC spatial containment (scope box fallback) | ScopeBomBuilder.java |
| Composition | Mirror partition → half-unit BOM | No — axis-agnostic algo | CompositionBomBuilder.java |
| Structural BOM | BUILDING + FLOOR STR BOMs | No — from extraction | StructuralBomBuilder.java |
| Room BOMs | Static children from YAML | No — template refs | FloorRoomBomBuilder.java |
| QA validation | Pre-commit gate | No — asserts | BomValidator.java |
| Pipeline orchestrator | Steps 1–11 in order | No | IFCtoBOMPipeline.java |
| Compilation | BOM + reference DB → output | No — resolves geometry | DAGCompiler/.../dsl/CompilationPipeline.java |
| Shell driver | Runs pipeline + delta tests | No | scripts/run_RosettaStones.sh |
Rule: If you need to change the pipeline output, change the YAML. Never patch data manually.
The YAML Fidelity Mantra¶
The YAML is the single source of intent. The compiler's job is to obey it.
The compiler does NOT open the reference IFC or its extracted DB during compilation (verified). The BOM stores parent-relative offsets, not absolute coordinates (verified). But neither fact proves the compiler is faithful to the YAML that produced the BOM.
The process of truth:
1. YAML declares Order intent (storey mapping, static children, IFC space → template mapping, disciplines)
2. BOM builders translate YAML → `m_bom` + `m_bom_line` with relative dx/dy/dz
3. BOMWalker walks the hierarchy → output elements
4. Proof: if you mutate a YAML value and recompile, the output must change accordingly

Testable questions:
- Change a storey `dz` → does the output shift by exactly that delta?
- Add a `static_children` entry → does it appear at the declared offset?
- Remove a `scope_spaces` entry → do those elements fall back to FLOOR STR?
- Change a `child_product_id` → does the output use the new product?

Until these mutations are tested, the proof for extracted buildings is "lossless round-trip", not "the compiler obeys its instructions".
See LAST_MILE_PROBLEM.md §Gap 4 (R4) for status.
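The dz mutation test can be sketched with a toy model. This is illustrative only (hypothetical offsets, not the real BOMWalker): a BOM stores parent-relative offsets, so mutating a storey `dz` must shift every element on that storey by exactly that delta.

```python
# Toy fidelity check (hypothetical data, not the real pipeline): world z of
# each element = storey offset + the child's parent-relative dz from m_bom_line.

def compile_elements(storey_dz, children):
    """Minimal stand-in for BOMWalker: resolve world z per child."""
    return [storey_dz + child_dz for child_dz in children]

children = [0.0, 0.3, 2.1]                 # parent-relative offsets
before = compile_elements(3.0, children)
after = compile_elements(3.5, children)    # mutate the YAML storey dz by +0.5

deltas = [a - b for a, b in zip(after, before)]
# Every element shifts by the mutated delta (within float tolerance).
print(deltas)
```

A real mutation test would run the pipeline twice against a patched YAML and diff the output DB the same way.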
File Convention¶
IFC source files: DAGCompiler/lib/input/IFC/*.ifc ← SOURCE (never generated)
Extracted ref DBs: DAGCompiler/lib/input/*_extracted.db ← one-time extraction from IFC
Classification YAML: IFCtoBOM/src/main/resources/classify_*.yaml ← human intent (the only invention)
DSL scripts: IFCtoBOM/src/main/resources/dsl_*.bim ← building grammar
BOM databases: library/{PREFIX}_BOM.db ← generated by IFCtoBOM pipeline
Component library: library/component_library.db ← master catalog (LFS-tracked base)
Compiled output: DAGCompiler/lib/output/{type}.db ← generated by compiler
Clean Slate — Running from Scratch¶
Everything below the IFC source files and extracted reference DBs is regenerable. To verify the pipeline works end-to-end from zero:
```bash
# 1. Archive generated databases
rm -f library/*_BOM.db                            # BOM recipes (regenerated by IFCtoBOM)
rm -f DAGCompiler/lib/output/*.db                 # compiled output (regenerated by compiler)
git checkout -- library/component_library.db      # restore LFS base (geometry + definitions)

# 2. Run the full pipeline (populate + IFCtoBOM + compile + test)
./scripts/run_RosettaStones.sh                    # all buildings
./scripts/run_RosettaStones.sh classify_sh.yaml   # or one building
```
run_RosettaStones.sh handles everything: populates component_library.db with
products and geometry links (skips if already done), runs IFCtoBOM to produce
*_BOM.db, compiles to output, and runs G1-G6 gate tests + C8/C9 fidelity checks.
What lives where:
| Artifact | Generated by | Persisted | Regenerable? |
|---|---|---|---|
| `*.ifc` | External (architect) | Git LFS | No — source |
| `*_extracted.db` | `tools/extract.py` (one-time) | gitignored | Yes — from IFC (slow, needs Python) |
| `component_library.db` | LFS base + populate step | Git LFS (base) | Yes — base from LFS, runtime tables from populate |
| `{PREFIX}_BOM.db` | IFCtoBOM pipeline | gitignored | Yes — from extracted DB + YAML |
| `output/*.db` | DAGCompiler | gitignored | Yes — from BOM + library |
Source IFC files (in DAGCompiler/lib/input/IFC/):
| File | Schema | Status |
|---|---|---|
| `Ifc4_SampleHouse.ifc` | IFC4 | Onboarded (SH) |
| `FZK_Haus_IFC4.ifc` | IFC4 | Onboarded (FK) |
| `AC11_Institute_IFC2x3.ifc` | IFC2x3 | Onboarded (IN) |
| `Ifc2x3_Duplex_*.ifc` | IFC2x3 | Onboarded (DX) |
| `SJTII-*.ifc` (7 discipline files) | IFC2x3 | Onboarded (TE) |
| `PCERT_Infra_Bridge_IFC4X3.ifc` | IFC4X3 | Onboarded (BR) |
| `PCERT_Infra_Road_IFC4X3.ifc` | IFC4X3 | Onboarded (RD) |
| `PCERT_Infra_Rail_IFC4X3.ifc` | IFC4X3 | Onboarded (RL) |
| `FJK_Project_IFC2x3.ifc` | IFC2x3 | — |
| `Smiley_West_IFC2x3.ifc` | IFC2x3 | — |
| `Vogel_Gesamt_IFC2x3.ifc` | IFC2x3 | — |
| `PCERT_Building_Architecture_IFC4X3.ifc` | IFC4X3 | — |
| `PCERT_Building_Hvac_IFC4X3.ifc` | IFC4X3 | — |
| `PCERT_Building_Structural_IFC4X3.ifc` | IFC4X3 | — |
| `PCERT_Infra_Plumbing_IFC4X3.ifc` | IFC4X3 | — |
Template generator (auto-detects storeys from reference DB):
```bash
mvn exec:java -pl IFCtoBOM \
  -Dexec.mainClass="com.bim.ifctobom.NewBuildingGenerator" \
  -Dexec.args="--prefix XX --type BuildingType --name 'Name'" -q
```
Full onboarding process: IFC_ONBOARDING_RUNBOOK.md
Classification YAML files:
- `classify_sh.yaml` — Ifc4_SampleHouse
- `classify_dx.yaml` — Ifc2x3_Duplex (1,099 elements)
- `classify_fk.yaml` — Ifc4_FZKHaus (82 elements)
- `classify_in.yaml` — Ifc2x3_AC11Institute (699 elements)
- `classify_te.yaml` — SJTII_Terminal (48,428 elements)
- `classify_br.yaml` — PCERT_Infra_Bridge (48 elements)
- `classify_rd.yaml` — PCERT_Infra_Road (53 elements)
- `classify_rl.yaml` — PCERT_Infra_Rail (73 elements)
- `classify_dm.yaml` — DemoHouse (template/generative mode)
Schema (v1)¶
building (required)¶
| Field | Type | Description |
|---|---|---|
| `building_type` | string | Must match reference DB name: `{building_type}_extracted.db` |
| `prefix` | string | Short code (SH, DX, TE). Used for BOM DB name: `{prefix}_BOM.db` |
| `building_bom_id` | string | Root BOM ID (e.g., `BUILDING_SH_STD`) |
| `doc_sub_type` | string | Building prefix (SH/DX/TE) |
| `name` | string | Human-readable building name |
| `dsl_file` | string | BIM COBOL script filename (e.g., `dsl_sh.bim`) |
storeys or segments (required)¶
Parsed by ClassificationYaml.java:94. Consumed by StructuralBomBuilder.java:83 to create per-segment FLOOR STR BOMs.
Maps segment names (from IFC spatial structure) to classification metadata.
For buildings, segments are storeys (IfcBuildingStorey). For infrastructure,
segments are facility parts (IfcRoadPart, IfcBridgePart, IfcRailwayPart).
The parser accepts either storeys: or segments: as the YAML key — they are
aliases. See InfrastructureAnalysis.md §4.2 for mapping.
```yaml
storeys:
  Ground Floor: { code: GF, bom_category: GF, role: GROUND_FLOOR, seq: 1010 }
  Roof:         { code: ROOF, bom_category: RF, role: ROOF, seq: 1020 }
```
| Field | Description | ERP Mapping |
|---|---|---|
| `code` | Short code for BOM ID: `{prefix}_{code}_STR` | M_Product.Value |
| `bom_category` | Category tag on the FLOOR BOM | M_Product_Category |
| `role` | Role string on the MAKE child in BUILDING BOM | C_OrderLine.Description |
| `seq` | Sequence number for ordering in BUILDING BOM | C_OrderLine.Line |
Key rules:
- Every storey name in the reference DB must have a matching key here. Unmapped storeys are silently dropped (with a warning).
- Each storey code must be unique within the building. The code becomes the BOM ID ({prefix}_{code}_STR). If two IFC storeys share the same code, the BOM builder creates duplicate BUILDING→FLOOR references and the compiler walks the floor BOM once per duplicate — producing extra elements (S57 finding: RA +31%, JE +29%, WA +220%, MO +2%). When onboard_ifc.sh generates a YAML with colliding codes, disambiguate them before committing (e.g. L1 → L1A/L1B, or merge both storeys into one entry if they are architecturally the same floor).
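The code-collision rule above can be checked before committing a generated YAML. A minimal sketch, assuming the storey mapping has already been loaded into a dict (the storey names and codes here are hypothetical):

```python
# Detect storey-code collisions in a classify_*.yaml storeys: mapping.
# Duplicate codes produce duplicate {prefix}_{code}_STR BOM IDs, which the
# compiler walks once per duplicate — inflating the output (S57 finding).
from collections import Counter

storeys = {
    "Level 1 - East": {"code": "L1"},
    "Level 1 - West": {"code": "L1"},   # collision: disambiguate to L1A/L1B
    "Roof":           {"code": "ROOF"},
}

counts = Counter(cfg["code"] for cfg in storeys.values())
collisions = sorted(code for code, n in counts.items() if n > 1)
print(collisions)  # → ['L1']
```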
floor_rooms (optional)¶
Parsed by ClassificationYaml.java:110. Consumed by ScopeBomBuilder.java (scope assignment) and FloorRoomBomBuilder.java (room BOM creation).
Defines room/scope space structure per storey. Two modes:
IFC-driven (preferred): elements assigned by IfcRelContainedInSpatialStructure
from the extraction DB. YAML maps IFC space names to BOM templates:
```yaml
floor_rooms:
  Ground Floor:
    bom_id: FLOOR_SH_GF_STD
    product_category: GF
    spaces:
      - { ifc_space: "1 - Living room", template_bom: SH_LIVING_SET, role: LIVING, seq: 10 }
      - { ifc_space: "2 - Bedroom", template_bom: SH_BED_SET, role: MASTER, seq: 30 }
```
Scope box (Order processing only): for sub-room zone subdivision at order time (BIM Designer GUI, BOM Drop). Not used during IFCtoBOM extraction.
```yaml
# Order-time sub-division (not extraction):
spaces:
  - { name: DINING, template_bom: SH_DINING_SET, role: DINING, seq: 20,
      aabb_mm: [2500, 1500, 1300], origin_m: [-6.5, -0.3, 0.0] }
```
| Space field | Description |
|---|---|
| `ifc_space` | IFC IfcSpace name (extraction: `rel_contained_in_space`) |
| `name` | Scope space name (Order processing fallback) |
| `template_bom` | BOM ID for furniture/fixture template |
| `role` | Role string on the LEAF child |
| `seq` | Sequence number |
| `aabb_mm` | Scope box dimensions in mm (Order processing only) |
| `origin_m` | Scope box origin in metres (Order processing only) |
IFC-driven: elements assigned by IfcSpace containment from extraction DB.
Scope box: elements assigned by centroid-in-box test at order time.
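The centroid-in-box test can be sketched as follows. This is illustrative, and it assumes `aabb_mm` gives the box dimensions in mm and `origin_m` its minimum corner in metres — the authoritative field semantics live in ScopeBomBuilder.java:

```python
# Scope-box fallback sketch: assign an element to a scope space if its
# centroid falls inside the declared box (assumed min-corner convention).

def in_scope_box(centroid_m, origin_m, aabb_mm):
    """Per-axis containment test; aabb_mm is converted mm → m."""
    return all(
        o <= c <= o + dim / 1000.0
        for c, o, dim in zip(centroid_m, origin_m, aabb_mm)
    )

# DINING example from above: 2500×1500×1300 mm box at (-6.5, -0.3, 0.0)
print(in_scope_box((-5.5, 0.4, 0.6), (-6.5, -0.3, 0.0), (2500, 1500, 1300)))  # → True
print(in_scope_box((1.0, 0.4, 0.6),  (-6.5, -0.3, 0.0), (2500, 1500, 1300)))  # → False
```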
static_children (optional)¶
Parsed by ClassificationYaml.java:151. Consumed by FloorRoomBomBuilder.java which inserts MAKE children into the BUILDING BOM.
Fixed MAKE children added to the BUILDING BOM (slabs, roof, MEP trunk, pair container).
```yaml
static_children:
  - { child_product_id: FLOOR_SLAB_GF, role: GROUND_SLAB, seq: 5, dz: 0.0 }
```
| Field | Description |
|---|---|
| `child_product_id` | BOM ID of the child assembly |
| `role` | Role string on the MAKE child |
| `seq` | Sequence number |
| `dz` | Vertical offset in metres |
composition (optional)¶
Parsed by ClassificationYaml.java:165. Consumed by CompositionBomBuilder.java which runs the three-tier mirror partition algorithm.
Defines how a building is composed from repeated units.
```yaml
composition:
  type: MIRRORED_PAIR
  pair_bom_id: DUPLEX_SET_STD
  half_unit_bom_id: DUPLEX_SINGLE_UNIT_STD
  mirror:
    axis: X
    position: 4.4
    rotation: 3.141592653589793
```
| Field | Description |
|---|---|
| `type` | Composition type: MIRRORED_PAIR (only one implemented) |
| `pair_bom_id` | BOM ID for the pair container (SET) |
| `half_unit_bom_id` | BOM ID for each half-unit (FLOOR) |
| `mirror.axis` | Partition axis: X, Y, or Z |
| `mirror.position` | Mirror plane position in world coords (party wall center) |
| `mirror.rotation` | B-side rotation in radians (pi = 180 degrees) |
See docs/DuplexAnalysis.md for the three-tier partition algorithm.
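The mirror geometry can be sketched as a reflection across the plane `x = mirror.position`. This is illustrative maths with made-up coordinates, not CompositionBomBuilder's actual code:

```python
# B-side placement sketch for a MIRRORED_PAIR with axis: X, position: 4.4.
# Reflecting across the plane x = position maps an A-side point to its
# B-side counterpart; applying it twice returns the original point.

def mirror_x(point, plane_x):
    """Reflect a point across the vertical plane x = plane_x."""
    x, y, z = point
    return (2 * plane_x - x, y, z)

plane = 4.4                    # mirror.position (party wall center)
a_side = (1.4, 2.0, 0.0)       # hypothetical A-side element position
b_side = mirror_x(a_side, plane)
print(tuple(round(v, 6) for v in b_side))  # → (7.4, 2.0, 0.0)
```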
How to Add a New Building¶
Step 0 — Ensure IFC element types are registered (one-time)¶
extract.py reads its list of supported IFC classes from the authority table
ad_ifc_class_map in library/ERP.db. If the new IFC file contains
element types not in that table, they will be silently skipped.
Check coverage:
```bash
# List IFC types in the file
python3 -c "import ifcopenshell; f=ifcopenshell.open('source.ifc'); print(sorted(set(e.is_a() for e in f.by_type('IfcElement'))))"

# Compare with registered types
sqlite3 library/ERP.db "SELECT ifc_class FROM ad_ifc_class_map WHERE is_active=1 ORDER BY ifc_class"
```
Add missing types:
```sql
INSERT INTO ad_ifc_class_map
  (ifc_class, discipline, category, attachment_face, ifc_schema, domain, description)
VALUES ('IfcNewType', 'ARC', 'NEW_CATEGORY', 'BOTTOM', 'IFC4', 'BUILDING', 'Description');
```
Zero code changes. See DISC_VALIDATION_DB_SRS.md §5.2 for the full schema.
Step 1 — Extract geometry from IFC (Python, one-time)¶
Use IfcOpenShell to extract element metadata + geometry into a reference DB.
See tools/extract.py for the extraction script.
```bash
python3 tools/extract.py --to reference source.ifc \
  -o DAGCompiler/lib/input/MyBuilding_extracted.db
```
Output: DAGCompiler/lib/input/MyBuilding_extracted.db containing:
- elements_meta — element names, IFC classes, storey assignments
- elements_rtree — bounding boxes (AABB min/max per axis)
- element_instances — geometry hashes per element
- base_geometries — mesh blobs (vertices + faces)
What happens next (automatic, inside the Java pipeline):
When you run the pipeline (Step 5), ExtractionPopulator.java reads this reference DB and populates library/component_library.db with:
| Table | Purpose | Reused? |
|---|---|---|
| `I_Element_Extraction` | Per-building element metadata with `M_Product_ID = element_ref` | Rebuilt per run |
| `I_Geometry_Map` | Element → geometry_hash links | INSERT OR IGNORE |
| `component_geometries` | Mesh blobs (vertices + faces) | INSERT OR IGNORE (shared across buildings) |
| `M_Product` | Persistent product catalog — reused across buildings | INSERT OR IGNORE |
| `M_Product_Image` | Product → geometry_hash canonical link | INSERT OR IGNORE |
component_library.db is the master catalog. Products created for one building are automatically reused by subsequent buildings if the same product_id appears.
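The INSERT OR IGNORE reuse pattern can be demonstrated with an in-memory stand-in for component_library.db (the table here is deliberately simplified — the real M_Product schema has more columns):

```python
import sqlite3

# Product reuse sketch: the second building's insert of an existing
# product_id is a silent no-op, so the catalog row is shared.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE m_product (product_id TEXT PRIMARY KEY, name TEXT)")

def register(products):
    conn.executemany("INSERT OR IGNORE INTO m_product VALUES (?, ?)", products)

register([("Window_01", "Std window"), ("Door_01", "Std door")])   # building A
register([("Window_01", "Std window"), ("Slab_GF", "GF slab")])    # building B reuses Window_01

count = conn.execute("SELECT COUNT(*) FROM m_product").fetchone()[0]
print(count)  # → 3 (Window_01 stored once, not twice)
```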
Schema docs: DATA_MODEL.md §Reference DB.
ERD: bim_architecture_viz.html.
Step 2 — Inspect the extracted data¶
Query the reference DB to understand storey names, element counts, and IFC classes:
```bash
# List storeys and element counts
sqlite3 DAGCompiler/lib/input/MyBuilding_extracted.db \
  "SELECT storey, COUNT(*) FROM elements_meta GROUP BY storey"

# List IFC classes and counts
sqlite3 DAGCompiler/lib/input/MyBuilding_extracted.db \
  "SELECT ifc_class, COUNT(*) FROM elements_meta GROUP BY ifc_class ORDER BY COUNT(*) DESC"

# Check for mirror symmetry (duplex/row house)
sqlite3 DAGCompiler/lib/input/MyBuilding_extracted.db \
  "SELECT MIN(r.minX), MAX(r.maxX), MIN(r.minY), MAX(r.maxY) FROM elements_rtree r"
```
These storey names must appear as keys in the YAML storeys: section.
For mirror buildings, identify the party wall position — see DuplexAnalysis.md.
Step 3 — Write the classification YAML (only invention step)¶
Create IFCtoBOM/src/main/resources/classify_{prefix}.yaml.
Copy from an existing YAML and adapt:
- classify_sh.yaml — simple building (no composition)
- classify_dx.yaml — mirrored pair (duplex)
Key fields to set:
- building_type — must match the reference DB filename (without _extracted.db)
- prefix — short code (2–3 chars), used for {prefix}_BOM.db
- storeys — one entry per storey name from step 2
- composition — add if the building has mirrored/repeated units
YAML is parsed by ClassificationYaml.java.
See Schema (v1) above for field reference.
Step 4 — Write the BIM COBOL DSL script¶
Create IFCtoBOM/src/main/resources/dsl_{prefix}.bim.
This script tells the DAGCompiler how to walk the BOM and emit elements.
Copy from an existing DSL:
- dsl_sh.bim — simple building
- dsl_dx.bim — duplex with mirror
Reference the DSL filename in the YAML: dsl_file: dsl_{prefix}.bim.
Verb reference: BIM_COBOL.md.
Compiler internals: SourceCodeGuide.md, BOMBasedCompilation.md.
Step 5 — Build the BOM (*_BOM.db)¶
```bash
rm -f library/{PREFIX}_BOM.db
./scripts/run_RosettaStones.sh classify_{prefix}.yaml
```
The shell script (run_RosettaStones.sh) calls
IFCtoBOMMain.java which runs
IFCtoBOMPipeline.java —
the single-transaction orchestrator that produces library/{PREFIX}_BOM.db:
| Pipeline step | Code | Writes to | What it does |
|---|---|---|---|
| 1. Load YAML | `ClassificationYaml.load()` | — | Parses the classification YAML into config records |
| 2. Create schema | `IFCtoBOMPipeline:234` | `*_BOM.db` | Creates m_bom, m_bom_line, ad_sysconfig tables (recipe + integrity hash) |
| 3. Extract | `ExtractionPopulator.populate()` | component_library.db | Reference DB → I_Element_Extraction, sets M_Product_ID = element_ref, imports missing geometry blobs |
| 4. Read extraction | `ExtractionReader.readByStorey()` | — | Reads I_Element_Extraction grouped by storey. FAIL if NULL M_Product_ID |
| ↳ Pre-flight | `IFCtoBOMPipeline` | — | Storeys auto-discovered from extraction Z-bands (P127). YAML `storeys:` is optional override |
| 5a. Product catalog | `ProductRegistrar.ensureProductCatalog()` | component_library.db | Creates M_Product in persistent catalog. INSERT OR IGNORE = reuse across buildings |
| 5b. Product images | `ProductRegistrar.ensureProductImages()` | component_library.db | Joins M_Product × I_Geometry_Map (on product_id = element_ref, filtered by building_type) → M_Product_Image |
| ↳ Pre-flight | `IFCtoBOMPipeline` | — | FAIL if any product has no geometry_hash |
| ~~5c. Copy products~~ | ~~`ProductRegistrar.ensureProducts()`~~ | ~~*_BOM.db~~ | DEAD CODE (R7): BOMWalker reads M_Product from component_library.db via compConn. Copy to BOM DB is no longer needed — pending removal |
| 6. Scope spaces | `ScopeBomBuilder.build()` | `*_BOM.db` | Assigns elements to rooms via IFC rel_contained_in_space → SET BOMs. Scope box fallback for buildings without IfcSpace data (P125) |
| 7. Composition | `CompositionBomBuilder.build()` | `*_BOM.db` | Mirror partition → half-unit LEAF lines + pair container (2 children) |
| 8. Structural | `StructuralBomBuilder.build()` | `*_BOM.db` | BUILDING BOM header + FLOOR STR BOMs with element LEAF lines + MAKE children. Reads rel_aggregates for IFC assembly BOMs (P129) |
| 9. Room BOMs | `FloorRoomBomBuilder.build()` | `*_BOM.db` | Static children from YAML + room template LEAF refs |
| 10. QA gate | `BomValidator.validateAndReport()` | — | Pre-commit validation: FAIL → rollback, broken data never reaches disk |
| 11. Commit | `IFCtoBOMPipeline` | `*_BOM.db` | Integrity hash + commit transaction |
Output:
- library/{PREFIX}_BOM.db — per-building factored recipe: m_bom (BOM headers),
m_bom_line (type lines — one per unique product per parent BOM, with qty and verb
formula reference). The compiler expands type lines to placement instances at compile
time. {PREFIX}_BOM.db is a recipe, not a placement map — see BOMBasedCompilation.md §2.1.6.
Should contain only m_bom + m_bom_line + ad_sysconfig (integrity hash).
No M_Product — product definitions live in component_library.db (master catalog)
- library/component_library.db — master product catalog (source of truth):
M_Product (definitions), M_Product_Image (geometry links, orientation),
I_Element_Extraction (element metadata), component_geometries (mesh blobs)
The BOM DB references products by ID. The library is the source of truth for product definitions, geometry, and orientation. Products are reused across buildings.
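The recipe-vs-placement distinction can be sketched with toy data (hypothetical products and quantities, not the real m_bom_line schema):

```python
# "Recipe, not placement map": a type line stores one row per unique product
# per parent BOM with a qty; the compiler expands it into qty placement
# instances at compile time.

bom_lines = [  # (parent_bom, child_product, qty, verb_ref) — simplified
    ("SH_GF_STR", "Column_300", 4, "TILE"),
    ("SH_GF_STR", "Slab_GF",    1, "MAKE"),
]

instances = [
    (parent, product, ordinal)
    for parent, product, qty, _verb in bom_lines
    for ordinal in range(qty)
]
print(len(bom_lines), len(instances))  # → 2 5 (2 type lines, 5 instances)
```

The verb formula (TILE, MAKE, …) decides where each instance lands; the BOM itself only says how many.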
BOM data model: BOMBasedCompilation.md.
ERP context (C_Order, BOM decisions): BBC.md §1.
Schema reference: DATA_MODEL.md.
Step 6 — Compilation and delta verification¶
The same run_RosettaStones.sh invocation continues after BOM creation.
Compilation runs the 12-stage pipeline (BOMBasedCompilation.md §5):
| Step | Code | What it does |
|---|---|---|
| Prepare compile DB | `run_RosettaStones.sh` | Copies *_BOM.db → temp _XX_compile.db |
| Compile (12 stages) | `CompilationPipeline.java` | Metadata → Compile → Write → Route → Verb → Digest → Geometry → Prove (+ 4 internal stages) |
| Contracts | `RosettaStoneGateTest.java` | G1-G6 gate tests |
| Rule 8 | `run_RosettaStones.sh` | All M_BOM_Line offsets within parent AABB envelope |
| Clash check | `run_RosettaStones.sh` | 0 furniture AABB overlaps |
| C8 Diversity | `run_RosettaStones.sh` | Per-instance mesh uniqueness preserved |
| C9 Axis | `run_RosettaStones.sh` | W/D/H match per axis vs reference |
Expected result: All checks PASS.
Compilation internals: SourceCodeGuide.md, BOMBasedCompilation.md §4.
Test architecture: TestArchitecture.md.
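The Rule 8 check reduces to a bounds test on parent-relative offsets. A minimal sketch with hypothetical AABB and offset values (the real check lives in run_RosettaStones.sh and BomValidator):

```python
# Rule 8 sketch: every child offset must land inside the parent BOM's AABB.
# An offset far outside the envelope usually means leaked world coordinates.

def rule8_violations(parent_aabb, offsets):
    (x0, y0, z0), (x1, y1, z1) = parent_aabb
    return [
        o for o in offsets
        if not (x0 <= o[0] <= x1 and y0 <= o[1] <= y1 and z0 <= o[2] <= z1)
    ]

floor_aabb = ((0, 0, 0), (10, 8, 3))                               # metres
offsets = [(1.0, 2.0, 0.0), (9.5, 7.5, 2.8), (612000.0, 4.0, 0.0)]  # last = UTM leak
print(rule8_violations(floor_aabb, offsets))  # → [(612000.0, 4.0, 0.0)]
```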
Step 7 — Mine validation rules from the Rosetta Stone¶
After 10/10 PASS, the output DB contains observed patterns that become validation rules. This is the same mining approach used for Terminal (NFPA13 sprinkler spacing from 48K elements).
7a. Query the output DB for patterns:
```bash
# Structural dimensions per (ifc_class, segment)
sqlite3 DAGCompiler/lib/output/{building_type}.db "
  SELECT em.ifc_class, em.storey, COUNT(*) as cnt,
         ROUND(AVG((r.maxX-r.minX)*1000)) as avg_W_mm,
         ROUND(AVG((r.maxY-r.minY)*1000)) as avg_D_mm,
         ROUND(AVG((r.maxZ-r.minZ)*1000)) as avg_H_mm
  FROM elements_meta em JOIN elements_rtree r ON em.id = r.id
  GROUP BY em.ifc_class, em.storey HAVING COUNT(*) > 1
  ORDER BY cnt DESC" -header -column
```
7b. Write a migration script (migration/DV00N_*.sql):
```sql
INSERT OR IGNORE INTO AD_Val_Rule
  (rule_id, rule_name, discipline, rule_type, description, mining_source, is_active)
VALUES ('MY_RULE', 'Description', 'STR', 'DIMENSION', 'Details', 'Source_Building', 1);

INSERT OR IGNORE INTO AD_Val_Rule_Param
  (rule_id, param_name, param_value, unit, description)
VALUES ('MY_RULE', 'width_mm', '3499', 'mm', 'Column width');
```
7c. Apply:
```bash
sqlite3 library/ERP.db < migration/DV00N_my_rules.sql
```
Rule types: DIMENSION (element W×D×H), RATIO (cross-element proportion),
MIN_DIMENSION (safety minimum), MIN_COUNT (regulatory), Z_CONTINUITY (stacking).
Full mining methodology: SourceCodeGuide.md §Chapter 4, Step 5.
Bridge rules: InfrastructureAnalysis.md §7.1.
Existing migration: migration/DV006_infra_bridge_rules.sql (13 rules, 29 params).
Step 8 — Troubleshoot¶
| Symptom | Cause | Fix |
|---|---|---|
| `Unmapped storey` warning | Storey name in ref DB not in YAML | Add the storey key to `storeys:` |
| `NULL M_Product_ID` warning | Should not happen with ExtractionPopulator | Check reference DB has elements_meta rows |
| `No geometry for ...` error | Reference DB missing mesh for some elements | Check element_instances table in reference DB |
| QA FAIL: Product-linked LEAF lines | NULL `child_product_id` on leaf | Check I_Element_Extraction.M_Product_ID |
| Delta count mismatch | Composition pairing issue | Check mirror position matches party wall center |
QA architecture: TestArchitecture.md.
ERP model context: MANIFESTO.md.
What NOT to Do¶
- Do NOT write manual SQL migrations for M_Product_ID — `ExtractionPopulator` does this
- Do NOT edit `I_Element_Extraction` manually — it is regenerated every pipeline run
- Do NOT hardcode element_ref → product mappings — `M_Product_ID = element_ref` is automatic
- Do NOT create per-building Python scripts — the Java pipeline is building-agnostic
Drift Prevention — What the Pipeline Enforces¶
The pipeline has runtime guards that FAIL (abort + rollback) on broken data. Every guard runs automatically on every build — no human memory required.
Enforced Guards (FAIL = pipeline aborts)¶
| Guard | Location | What It Catches |
|---|---|---|
| NULL M_Product_ID | `ExtractionReader` | Broken extraction → unlinked BOM leaves |
| NULL child_product_id on LEAF | `BomValidator` | BOMWalker silent skip → 0 placements |
| Missing element_ref on LEAF | `BomValidator` | G5-PROVENANCE can't trace to library |
| Extraction reconciliation | `BomValidator` | LEAFs + paired != extraction count → silent element loss |
| Unmapped storey in extraction | `IFCtoBOMPipeline` | Storey not in YAML → elements silently dropped |
| Geometry completeness | `IFCtoBOMPipeline` | Products without geometry_hash → 0 placements |
| World-coord offsets (>500m) | `BomValidator` | Hardcoded world coordinates in dx/dy/dz. Note: this checks parent-relative offsets, not absolute coords. Infrastructure elements with UTM georeferencing are safe — their parent-relative offsets are bounded (~80m max). See InfrastructureAnalysis.md §3.1 G6. |
| BUILDING count != 1 | `BomValidator` | Multiple or zero root BUILDING BOMs. Infrastructure IFCs with multiple facilities must be extracted per-facility to satisfy this guard. See InfrastructureAnalysis.md §2.4. |
| Orphan BOM lines | `BomValidator` | Child references non-existent parent |
| AABB envelope violation | `BomValidator` | Floor AABB exceeds building |
| Schema version mismatch | `ClassificationYaml` | YAML declares v2 but parser is v1 |
| GUID ordinal uniqueness | `PlacementCollectorVisitor` | Always ++ordinalCounter — stored BOM ordinals never used for GUIDs (collision trap) |
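The GUID ordinal rule can be sketched as follows. The GUID format here is hypothetical (the real scheme lives in PlacementCollectorVisitor); the point is the always-incrementing counter:

```python
import itertools

# GUID uniqueness sketch: a monotonically increasing per-run counter
# guarantees distinct GUID inputs even when several placements share the
# same BOM and product (stored BOM ordinals can collide; the counter cannot).
ordinal_counter = itertools.count(1)

def placement_guid(bom_id, product_id):
    # Always take the next counter value — never a stored BOM ordinal.
    return f"{bom_id}:{product_id}:{next(ordinal_counter)}"

guids = [placement_guid("SH_GF_STR", "Window_01") for _ in range(3)]
print(len(set(guids)))  # → 3: no collisions despite identical bom/product
```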
Advisory Guards (reported, do not block)¶
| Guard | Location | What It Reports |
|---|---|---|
| Verb expansion fidelity | `BomValidator` (step 9b) | Expands each verb_ref, compares world centroids against original extraction. Max/avg error per verb. TILE/ROUTE should be ≤5mm, SPRAY advisory. |
| Factorization ratio | `BomValidator` | WARN if >10× lines/products (TE: 2.6×, healthy) |
| Duplicate positions | `BomValidator` | Same product at same dx/dy/dz (WARN, not FAIL) |
What the Pipeline Does NOT Validate¶
These are documented ASSUMPTION remarks in the code — comment-only, no runtime guard:
- Scope box coordinate frame stability — For buildings still using scope box fallback (no IFC spatial data), `origin_m` is assumed to match extraction centroids. If IFC is re-extracted with a different `IfcMapConversion` offset, scope box containment silently breaks. IFC-driven buildings (`ifc_space:`) are immune. (ScopeBomBuilder ASSUMPTION)
- Composition geometric validity — Mirror pairing matches by product count per storey, not by geometric spatial mirroring. (CompositionBomBuilder ASSUMPTION)
- Cross-discipline product_id uniqueness — If two disciplines have elements with the same stripped name (e.g. both ARC and ACMV have "Window_01"), they collapse to one M_Product. No cross-discipline collision check exists.
- Infrastructure IFC4X3 spatial containers — `IfcRoad`, `IfcBridge`, `IfcRailway` use `IfcFacilityPart` instead of `IfcBuildingStorey`. The Python extraction layer (`get_storey_for_element()`) already handles this (DONE 2026-03-16). The Java spatial structure extraction (`extract_from_ifc_to_reference()`) needs extension to extract IfcRoad/IfcBridge/IfcRailway into the `spatial_structure` table. See InfrastructureAnalysis.md §3.1 G10.
- Discipline stratification — The `disciplines:` section in YAML (e.g. classify_te.yaml) is declared but not parsed by schema v1. TE gets storey-level structural BOMs only.
Adding a New Building or Facility — Pre-flight Checklist¶
Before first pipeline run with a new classify_*.yaml:
- LOD population (one-time): `python3 tools/extract.py --to library source.ifc --classes ...` This populates component_library.db with geometry for the new element types (INSERT OR IGNORE).
- Reference extraction: `python3 tools/extract.py --to reference source.ifc -o DAGCompiler/lib/input/{BuildingType}_extracted.db`
- Query the reference DB for segments: `sqlite3 ...extracted.db "SELECT storey, COUNT(*) FROM elements_meta GROUP BY storey"`
- Write `classify_{prefix}.yaml` with every segment name as a key in `storeys:` (buildings) or `segments:` (infrastructure). The pipeline will FAIL if any are missing.
- Run the pipeline: `./scripts/run_RosettaStones.sh classify_{prefix}.yaml`
- The pipeline automatically:
  - Populates `I_Element_Extraction` in component_library.db (ExtractionPopulator)
  - Creates products in the component_library.db catalog (INSERT OR IGNORE = reuse)
  - Links products to geometry (M_Product_Image)
- Check the QA report: extraction reconciliation PASS = every element accounted for
- Check for the "products reused from catalog" message — confirms cross-building reuse is working
For infrastructure IFCs, also see InfrastructureAnalysis.md §9 for the phased extraction path.
Schema v3 (planned): MEP Rules-Based Laying¶
Status: SRS — not yet implemented. See docs/G4_SRS.md, docs/TE_MINING_RESULTS.md.
The ProcessIt() Pattern¶
In iDempiere, MOrder.processIt() fires the document engine — tax calculation,
inventory reservation, accounting. The user fills in the order lines, clicks
"Process", and the engine applies all business rules automatically.
The BIM Designer follows the same pattern:
```
User action:                        Engine response:
─────────────                       ─────────────────
Define space (room AABB)        →   C_OrderLine created in output.db
Set MEP = true in YAML          →   Discipline flags on building config
Click "Compile It" (ProcessIt)  →   ConstructionModelSpawner + PlacementValidator
                                →   AD_Val_Rule fires per discipline
                                →   MEP elements placed by mined rules
                                →   Clearance checked (ERP-maths, not mesh)
```
mep Section (schema v3)¶
```yaml
building:
  building_type: MyHouse_2BR
  prefix: MH
  # ... existing fields ...

mep:
  enabled: true        # triggers MEP auto-population on ProcessIt()
  jurisdiction: MY     # drives AD_Val_Rule selection
  disciplines:
    FP:
      enabled: true
      occupancy_class: LH   # NFPA 13 Light Hazard
      # Rules auto-applied from AD_Val_Rule:
      #   NFPA13_LH_SPACING: min=3000mm, max=4600mm
      #   FP branch pipe max: ≤12000mm per run
    ELEC:
      enabled: true
      # Rules auto-applied:
      #   IES_LIGHT_SPACING: max=5000mm
      #   NEC_ELEC_SP_CLEARANCE: min=150mm from SP
    SP:
      enabled: true
      # Rules auto-applied:
      #   NEC_ELEC_SP_CLEARANCE: min=150mm from ELEC
```
How Rules Drive Placement¶
The compiler does NOT hardcode "sprinklers go every 4m." It reads
AD_Val_Rule parameters mined from Rosetta Stones:
1. User defines room: AABB = 8000 × 6000 × 3000mm
2. YAML says mep.FP.enabled = true, occupancy_class = LH
3. ProcessIt() → ConstructionModelSpawner:
a. SELECT * FROM AD_Val_Rule WHERE discipline='FP' AND jurisdiction='MY'
b. Rule NFPA13_LH_SPACING: min=3000, max=4600, typical=3500
c. Compute grid pitch:
pitch = min(max_spacing, room_dim / ceil(room_dim / typical_spacing))
X: min(4600, 8000/ceil(8000/3500)) = min(4600, 2667) = 2667mm → 3 cols
Y: min(4600, 6000/ceil(6000/3500)) = min(4600, 3000) = 3000mm → 2 rows → 6 heads
(typical_spacing is the observed dominant pitch from mining, not the code max)
d. INSERT 4 C_OrderLine (sprinkler heads) with tack dx/dy/dz
e. PlacementValidator checks each placement against AD_Val_Rule
4. Cross-discipline check (Tier 2):
a. ERP-maths clearance: centreline distance - cross-section radii
b. Uses M_Product dimensions (pipe diameter), not mesh geometry
c. NEC_ELEC_SP_CLEARANCE: flag any pair < 150mm
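The pitch computation in step 3c can be expressed directly. A minimal sketch, assuming the formula above; `SprinklerGrid`, `pitchMm`, and `headsAlong` are hypothetical names, not the compiler's API:

```java
// Hypothetical sketch of the grid-pitch computation from step 3c.
class SprinklerGrid {
    // pitch = min(max_spacing, room_dim / ceil(room_dim / typical_spacing))
    static int pitchMm(int roomDimMm, int maxSpacingMm, int typicalSpacingMm) {
        int bays = (int) Math.ceil((double) roomDimMm / typicalSpacingMm);
        return Math.min(maxSpacingMm, (int) Math.round((double) roomDimMm / bays));
    }

    // One head per bay along the axis at the typical (mined) spacing.
    static int headsAlong(int roomDimMm, int typicalSpacingMm) {
        return (int) Math.ceil((double) roomDimMm / typicalSpacingMm);
    }
}
```

For the 8000 × 6000 mm room above this yields pitches of 2667 mm and 3000 mm, i.e. a 3 × 2 grid of 6 heads.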
The key insight: Rules are DATA (AD_Val_Rule rows mined from TE/DX), not CODE. Adding a new jurisdiction = INSERT new rule rows. Adding a new discipline = INSERT new rules. Zero Java changes. Same pattern as iDempiere tax tables — rates are data, the tax engine is generic.
Clearance via ERP Maths (not Bonsai geometry)¶
Cross-discipline clearance uses product dimensions from the BOM, not viewport mesh geometry. This sidesteps the Bonsai dependency entirely:
clearance = centreline_2D_distance - radius_a - radius_b
where:
radius = MIN(width, depth) / 2 ← pipe cross-section from M_Product
Verified against TE: 48K elements, 11 true overlaps, 35 under 150mm.
See docs/TE_MINING_RESULTS.md §M12 for full results.
This means clearance checking works:
- At compile time (Rosetta Stone verification)
- At design time (BIM Designer ambient compliance — no Blender needed)
- At batch time (SQL reports against output.db)
When Bonsai viewport is available (Phase G-8 BlenderBridge), the same check can optionally use mesh-level precision — but the ERP-maths version is the default, always-available baseline.
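A minimal sketch of the ERP-maths check, assuming the formula above; `ErpClearance` and its method names are hypothetical, and the M_Product width/depth values are passed in as plain numbers:

```java
// Hypothetical sketch of the ERP-maths clearance check. No mesh geometry:
// only centreline coordinates and M_Product cross-section dimensions.
class ErpClearance {
    // radius = MIN(width, depth) / 2  — pipe cross-section from M_Product
    static double radiusMm(double widthMm, double depthMm) {
        return Math.min(widthMm, depthMm) / 2.0;
    }

    // clearance = centreline 2D distance - radius_a - radius_b
    static double clearanceMm(double ax, double ay, double bx, double by,
                              double rA, double rB) {
        return Math.hypot(bx - ax, by - ay) - rA - rB;
    }
}
```

Two 100 mm pipes with centrelines 300 mm apart clear by 200 mm; a pair clearing under 150 mm would be flagged by NEC_ELEC_SP_CLEARANCE.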
Relationship to Existing Schema¶
| Schema version | What it adds | Depends on |
|---|---|---|
| v1 (current) | building, storeys, floor_rooms, static_children | — |
| v2 (TE) | disciplines section (ifc_class → bom_category map) | v1 |
| v2+ (infra) | segments: alias for storeys:, M_Product_Category=IN, infrastructure discipline map | v2. See InfrastructureAnalysis.md §3 |
| v3 (planned) | mep section (rules-based MEP auto-population) | v2 + AD_Val_Rule + output.db |
Terrain — Infrastructure Placement on Terrain¶
Infrastructure elements are placed relative to a terrain surface, not a storey floor.
The Designer treats terrain as a placement context — same abstract contract as a
room container, but with variable Z. Full technical details:
InfrastructureAnalysis.md §8.
Outline Steps: Terrain-Aware Infrastructure Design¶
Step 1: IMPORT TERRAIN
Source: Federation pdf_terrain addon → survey_highres_extracted.json
Format: ground_elevations[] with pixel x/y + z elevation (metres)
Transform: world_x = px × scale, world_y = (img_h - py) × scale
Result: AlignmentContext with 689 survey points, elevationAt(x,y)
Step 2: SELECT FACILITY TYPE
User picks: BRIDGE / ROAD / RAILWAY / TUNNEL from facility dropdown
API: listFacilityTypes() → FacilityType enum
Effect: loads provenance-scoped validation rules (30 infra rules)
YAML: M_Product_Category=IN, segments: alias for storeys:
Step 3: DEFINE ALIGNMENT
User draws polyline over terrain in viewport
Each vertex gets Z from terrain.elevationAt(x,y)
Result: AlignmentContext with station points along centreline
Corridor width: road=7300mm, rail=5000mm, bridge=12000mm (from YAML)
Step 4: PLACE SEGMENTS
Auto-generate from Rosetta Stone BOM pattern:
Bridge: ABT → PIR → DCK → SUP → APR (from classify_br.yaml)
Road: CW × 4 + PKG (from classify_rd.yaml)
Rail: TRK (from classify_rl.yaml)
Each segment bbox placed along alignment
Step 5: TERRAIN SNAP (interactive drag)
User drags element → Z follows terrain via TerrainSnap mode:
ON_SURFACE: road layers, sleepers (Z = terrain + offset)
ABOVE: bridge deck (Z = terrain + clearance)
BELOW: tunnel, pipeline (Z = terrain - cover - height)
PIER: bridge pier, abutment (Z = terrain, extends up)
Wireframe bboxes shown during drag — flows along terrain
Step 6: LAYER STACKING (road MAKE path)
Road pavement stacks 4 layers on terrain:
subgrade (250mm) → base (120mm) → binder (80mm) → surface (40mm)
Each layer Z = terrain + cumulative offset below
Same pattern as Assembly Builder wall layers
Step 7: VALIDATE (snap + rules)
snap(bboxes, "", gridMm, "ROAD") → loads road validation rules
Each element checked: width_mm, depth_mm, height_mm, thickness_mm
BLOCK/PASS verdicts per element — same UX as building validation
Step 8: ADJUST OFFSETS (engineering controls)
User adjusts: fill height, cut depth, clearance, cover
Re-snap → re-validate → iterate until compliant
Gradient check: compare Z at consecutive stations
Step 9: CO SAVE (incremental)
On save: wireframe bboxes → actual geometry in output DB
Shape updates incrementally as each element resolves
output.db stores compile state; design state saved in .blend file
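The snap modes in Step 5 reduce to simple Z arithmetic against the terrain elevation. A hedged sketch; `TerrainSnapSketch` and `snapZ` are illustrative names, not the Designer's actual TerrainSnap implementation:

```java
// Hypothetical sketch of the Step 5 TerrainSnap modes. Each mode computes
// an element Z from the terrain elevation at the element's x/y.
class TerrainSnapSketch {
    enum Mode { ON_SURFACE, ABOVE, BELOW, PIER }

    static double snapZ(Mode mode, double terrainZ, double offset,
                        double clearance, double cover, double height) {
        return switch (mode) {
            case ON_SURFACE -> terrainZ + offset;          // road layers, sleepers
            case ABOVE      -> terrainZ + clearance;       // bridge deck
            case BELOW      -> terrainZ - cover - height;  // tunnel, pipeline
            case PIER       -> terrainZ;                    // pier/abutment base, extends up
        };
    }
}
```

During a drag, re-evaluating `snapZ` at each new x/y is what makes the wireframe bbox flow along the terrain.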
Terrain Data Contract¶
The terrain JSON from Federation is the input contract:
| Field | Type | Unit | Description |
|---|---|---|---|
| `ground_elevations[].x` | float | pixels | Image X coordinate |
| `ground_elevations[].y` | float | pixels | Image Y coordinate |
| `ground_elevations[].z` | float | metres | Ground elevation (ASL) |
| `metadata.scale` | float | m/pixel | Pixel-to-world scale factor |
| `metadata.image_dimensions.height` | int | pixels | Image height (for Y flip) |
Java reads this via AlignmentContext(List<StationPoint>, corridorWidthMm).
Python writes it via BIM_OT_pdf_terrain_generate operator.
Same IfcOpenShell-writes / Java-reads contract as all Federation PoCs.
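The pixel-to-world transform from Step 1 follows directly from the contract fields. A minimal sketch; `TerrainTransform` and `toWorld` are hypothetical names, not the AlignmentContext API:

```java
// Hypothetical sketch of the pixel-to-world transform from the terrain
// contract: world_x = px * scale, world_y = (img_h - py) * scale (Y flip,
// because image Y grows downward while world Y grows upward).
class TerrainTransform {
    static double[] toWorld(double px, double py, double scale, int imgHeightPx) {
        return new double[] { px * scale, (imgHeightPx - py) * scale };
    }
}
```

With `metadata.scale = 0.5` m/pixel and a 1000 px tall image, pixel (100, 0) maps to world (50, 500) metres.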
Further Reading¶
Architecture & Concepts¶
| Topic | Document |
|---|---|
| ERP model (C_Order, BOM, decisions) | MANIFESTO.md |
| Spatial MRP (construction as ERP II) | ConstructionAsERPII.txt |
| BOM compilation, tack §4 | BOMBasedCompilation.md |
| BIM as BOM concept | MANIFESTO.md §The Pattern |
| Conceptual blueprint | CONCEPTUAL BLUEPRINT.txt |
| Rosetta Stone strategy | TheRosettaStoneStrategy.md |
| BIM Designer vision | BIM_Designer.md |
Data Model & Schema¶
| Topic | Document |
|---|---|
| Schema, tables, I_Element_Extraction | DATA_MODEL.md |
| ERD (interactive HTML) | bim_architecture_viz.html |
| Terminal ERD | terminal_erd.html |
Source Code & Development¶
| Topic | Document |
|---|---|
| Source code walkthrough | SourceCodeGuide.md |
| DAO, ORM, build instructions | SourceCodeGuide.md |
| BIM COBOL verbs (77 verbs) | BIM_COBOL.md |
| Prefab architecture | PREFAB_ARCHITECTURE.md |
| Validation rules | DocValidate.md |
QA & Testing¶
| Topic | Document |
|---|---|
| Test architecture, tamper seal | TestArchitecture.md |
| Current state, gate status | ../PROGRESS.md |
| Roadmap (phases 0–H) | ACTION_ROADMAP.md |
Building-Specific Analysis¶
| Building | Document |
|---|---|
| DX mirror forensics | DuplexAnalysis.md |
| TE ERP architecture | TerminalAnalysis.md |
| SH data model | DATA_MODEL.md |
Appendix — Adding New IFC Files¶
Most users will never need this. The 35 onboarded buildings and their products are already in the repository. This section is for contributors who want to extend the system with new IFC source files.
One command handles the entire process — extraction, YAML generation, BOM creation, compilation, testing, and validation rule mining:
./scripts/onboard_ifc.sh \
--prefix XX --type BuildingType \
--name "Human Name" --base RE \
--ifc DAGCompiler/lib/input/IFC/source.ifc
This runs 8 steps: recon → extract → generate YAML/DSL → register manifest → register GATE_SCOPE → pipeline (populate + IFCtoBOM + compile) → extract validation rules → report.
After onboarding, run_RosettaStones.sh includes the new building automatically.
Review the generated classify_XX.yaml and commit.
Full guide: IFC_ONBOARDING_RUNBOOK.md.