MPP CodeGraph
A Kotlin Multiplatform library for parsing source code and building code graphs using TreeSitter.
Build Status
✅ Build Successful - All tests passing
Quick Commands
# Build the module
./gradlew :mpp-codegraph:build
# Run all tests
./gradlew :mpp-codegraph:allTests
# Run JVM tests only
./gradlew :mpp-codegraph:jvmTest
# Run JS tests only
./gradlew :mpp-codegraph:jsTest
Overview
MPP CodeGraph provides a unified API for parsing source code across different platforms (JVM and JS) using TreeSitter parsers. It extracts code structure information (classes, methods, fields, etc.) and relationships (inheritance, composition, dependencies) to build a comprehensive code graph.
Features
- Multiplatform Support: Works on JVM and JavaScript platforms
- TreeSitter-based Parsing: Uses TreeSitter for accurate and fast parsing
- Language Support:
  - Fully Tested: Java, JavaScript, TypeScript, Python
  - Experimental: Kotlin, C#, Go, Rust
- Code Graph Model: Unified data model for code nodes and relationships
- Type-safe API: Kotlin-first design with full type safety
Architecture
Common Code (commonMain)
The common code defines the core data models and interfaces:
- Model Classes:
  - CodeNode: Represents a code element (class, method, field, etc.)
  - CodeRelationship: Represents a relationship between code elements
  - CodeGraph: Container for nodes and relationships
  - CodeElementType: Enum of code element types
  - RelationshipType: Enum of relationship types
- Parser Interface:
  - CodeParser: Common interface for parsing code
  - Language: Enum of supported programming languages
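To make the shape of the common model concrete, here is a minimal sketch of how such types could fit together. Only the type names, `CodeElementType.CLASS`, `RelationshipType.MADE_OF`, `getNodesByType`, and `getRelationshipsByType` come from this README; every field name, the other enum values, and the `sampleGraph` helper are assumptions made for illustration, not the library's actual definitions.

```kotlin
// Hypothetical sketch of the common model. Field names and most enum
// values are assumptions, not the library's real declarations.
enum class CodeElementType { CLASS, METHOD, FIELD }
enum class RelationshipType { MADE_OF, EXTENDS, DEPENDS_ON }

data class CodeNode(
    val id: String,
    val type: CodeElementType,
    val name: String,
    val filePath: String,
)

data class CodeRelationship(
    val sourceId: String,
    val targetId: String,
    val type: RelationshipType,
)

data class CodeGraph(
    val nodes: List<CodeNode>,
    val relationships: List<CodeRelationship>,
) {
    // Return every node of the requested element type.
    fun getNodesByType(type: CodeElementType): List<CodeNode> =
        nodes.filter { it.type == type }

    // Return every relationship of the requested type.
    fun getRelationshipsByType(type: RelationshipType): List<CodeRelationship> =
        relationships.filter { it.type == type }
}

// Illustrative helper: a graph with one class that contains one method.
fun sampleGraph(): CodeGraph {
    val cls = CodeNode("HelloWorld", CodeElementType.CLASS, "HelloWorld", "HelloWorld.java")
    val method = CodeNode("HelloWorld.sayHello", CodeElementType.METHOD, "sayHello", "HelloWorld.java")
    return CodeGraph(
        nodes = listOf(cls, method),
        relationships = listOf(CodeRelationship(cls.id, method.id, RelationshipType.MADE_OF)),
    )
}

fun main() {
    val graph = sampleGraph()
    println(graph.getNodesByType(CodeElementType.CLASS).map { it.name })
    println(graph.getRelationshipsByType(RelationshipType.MADE_OF).size)
}
```

Keeping nodes and relationships as flat lists, joined by id, is one simple way to keep such a model serializable and identical on both platforms.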
JVM Implementation (jvmMain)
Uses TreeSitter Java bindings from io.github.bonede:
- Dependencies:
  - tree-sitter:0.25.3
  - tree-sitter-java:0.23.4
  - tree-sitter-kotlin:0.3.8.1
  - tree-sitter-c-sharp:0.23.1
  - tree-sitter-javascript:0.23.1
  - tree-sitter-python:0.23.4
- Implementation:
  - JvmCodeParser: JVM-specific parser implementation
  - Based on SASK project architecture
JS Implementation (jsMain)
Uses web-tree-sitter for browser and Node.js:
- Dependencies:
  - web-tree-sitter:0.22.2
  - @unit-mesh/treesitter-artifacts:1.7.4
- Implementation:
  - JsCodeParser: JavaScript-specific parser implementation
  - Based on autodev-workbench architecture
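For orientation, the JVM and JS dependencies listed above could be declared in a Kotlin Multiplatform build script roughly as follows. This is a sketch of a `build.gradle.kts` fragment, not a copy of the module's actual build file: the group id, artifact names, and versions are taken from this README, while the target setup and source set layout are assumptions.

```kotlin
// build.gradle.kts (fragment; plugins block omitted)
// Assumed layout: one jvm target and one js target with default names.
kotlin {
    jvm()
    js {
        browser()
        nodejs()
    }

    sourceSets {
        val jvmMain by getting {
            dependencies {
                // TreeSitter Java bindings from io.github.bonede
                implementation("io.github.bonede:tree-sitter:0.25.3")
                implementation("io.github.bonede:tree-sitter-java:0.23.4")
                implementation("io.github.bonede:tree-sitter-kotlin:0.3.8.1")
                implementation("io.github.bonede:tree-sitter-c-sharp:0.23.1")
                implementation("io.github.bonede:tree-sitter-javascript:0.23.1")
                implementation("io.github.bonede:tree-sitter-python:0.23.4")
            }
        }
        val jsMain by getting {
            dependencies {
                // npm packages consumed by the JS target
                implementation(npm("web-tree-sitter", "0.22.2"))
                implementation(npm("@unit-mesh/treesitter-artifacts", "1.7.4"))
            }
        }
    }
}
```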
Usage
Basic Usage
import cc.unitmesh.codegraph.CodeGraphFactory
import cc.unitmesh.codegraph.parser.Language
// Create a parser instance
val parser = CodeGraphFactory.createParser()
// Parse a single file
val sourceCode = """
    package com.example;

    public class HelloWorld {
        public void sayHello() {
            System.out.println("Hello");
        }
    }
""".trimIndent()
val nodes = parser.parseNodes(sourceCode, "HelloWorld.java", Language.JAVA)
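// Parse multiple files and build a graph
// (greeterSource is a second Java source string, shown here as a one-line placeholder)
val greeterSource = "public class Greeter { }"
val files = mapOf(
    "HelloWorld.java" to sourceCode,
    "Greeter.java" to greeterSource
)
val graph = parser.parseCodeGraph(files, Language.JAVA)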
// Query the graph
val classes = graph.getNodesByType(CodeElementType.CLASS)
val relationships = graph.getRelationshipsByType(RelationshipType.MADE_OF)
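Once relationships have been retrieved by type, a common next step is to group them by their source element, for example to list the members of each class. The following is a minimal sketch of that idea using stand-in types: the field names `sourceId` and `targetId`, the `EXTENDS` value, the helper `targetsBySource`, and the assumption that `MADE_OF` points from a container to its members are all hypothetical and not confirmed by the library.

```kotlin
// Minimal stand-ins for the library's model types; all fields are assumptions.
enum class RelationshipType { MADE_OF, EXTENDS }

data class CodeRelationship(
    val sourceId: String,
    val targetId: String,
    val type: RelationshipType,
)

// Group relationship targets by source for a single relationship type,
// e.g. class id -> member ids when the type is MADE_OF.
fun targetsBySource(
    relationships: List<CodeRelationship>,
    type: RelationshipType,
): Map<String, List<String>> =
    relationships
        .filter { it.type == type }
        .groupBy({ it.sourceId }, { it.targetId })

fun main() {
    val relationships = listOf(
        CodeRelationship("HelloWorld", "HelloWorld.sayHello", RelationshipType.MADE_OF),
        CodeRelationship("Greeter", "Greeter.greet", RelationshipType.MADE_OF),
        CodeRelationship("Greeter", "HelloWorld", RelationshipType.EXTENDS),
    )
    val members = targetsBySource(relationships, RelationshipType.MADE_OF)
    println(members["Greeter"])
}
```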
Platform-Specific Usage
JVM
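import cc.unitmesh.codegraph.parser.Language
import cc.unitmesh.codegraph.parser.jvm.JvmCodeParser

val parser = JvmCodeParser()
// sourceCode is the file contents and filePath its path, as in Basic Usage
val nodes = parser.parseNodes(sourceCode, filePath, Language.JAVA)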
JavaScript/Node.js
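import cc.unitmesh.codegraph.parser.Language
import cc.unitmesh.codegraph.parser.js.JsCodeParser

val parser = JsCodeParser()
parser.initialize() // Initialize TreeSitter before the first parse call
// sourceCode is the file contents and filePath its path, as in Basic Usage
val nodes = parser.parseNodes(sourceCode, filePath, Language.JAVASCRIPT)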
Building
Build All Platforms
./gradlew :mpp-codegraph:build
Build and Test JVM Only
./gradlew :mpp-codegraph:jvmTest
Build and Test JS Only
./gradlew :mpp-codegraph:jsTest
Assemble JS Package
./gradlew :mpp-codegraph:assembleJsPackage
Testing
Run tests for all platforms:
./gradlew :mpp-codegraph:allTests
Run JVM tests only:
./gradlew :mpp-codegraph:jvmTest
Run JS tests only:
./gradlew :mpp-codegraph:jsTest
Version Information
TreeSitter Versions
JVM (io.github.bonede):
- tree-sitter: 0.25.3
- tree-sitter-java: 0.23.4
- tree-sitter-kotlin: 0.3.8.1
- tree-sitter-c-sharp: 0.23.1
- tree-sitter-javascript: 0.23.1
- tree-sitter-python: 0.23.4
JS (npm packages):
- web-tree-sitter: 0.22.2
- @unit-mesh/treesitter-artifacts: 1.7.4
- tree-sitter-java: 0.21.0
- tree-sitter-kotlin: 0.3.8
- tree-sitter-c-sharp: 0.20.0
Design Principles
- Platform Abstraction: Common interfaces with platform-specific implementations
- Consistent API: Same API across all platforms
- Version Alignment: TreeSitter versions aligned with reference projects (SASK and autodev-workbench)
- Type Safety: Full Kotlin type safety with serializable models
- Extensibility: Easy to add new languages and relationship types
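The platform abstraction principle can be illustrated with a single-file sketch: callers depend only on a common interface and obtain an implementation from a factory. In a Kotlin Multiplatform project the factory would typically be backed by `expect`/`actual` declarations per target; that mechanism is left out here so the example runs as one file. Everything below is hypothetical: `RegexClassParser` and `ParserFactory` are invented names, the interface is simplified to return class names as strings, and the regex is a toy stand-in for TreeSitter parsing.

```kotlin
enum class Language { JAVA, JAVASCRIPT }

// Simplified common interface: returns class names instead of full nodes.
interface CodeParser {
    fun parseNodes(sourceCode: String, filePath: String, language: Language): List<String>
}

// Toy implementation that finds class names with a regex.
// This is NOT how the library parses code; it only stands in for a
// platform-specific TreeSitter-backed parser.
class RegexClassParser : CodeParser {
    private val classPattern = Regex("""\bclass\s+([A-Za-z_]\w*)""")

    override fun parseNodes(sourceCode: String, filePath: String, language: Language): List<String> =
        classPattern.findAll(sourceCode).map { it.groupValues[1] }.toList()
}

// Callers ask the factory for a parser and never name the concrete class,
// so swapping the implementation per platform does not affect them.
object ParserFactory {
    fun createParser(): CodeParser = RegexClassParser()
}

fun main() {
    val parser = ParserFactory.createParser()
    val source = "public class HelloWorld { } class Greeter { }"
    println(parser.parseNodes(source, "HelloWorld.java", Language.JAVA))
}
```

Because the calling code only sees `CodeParser`, adding a new language or platform in this style means adding an implementation behind the factory rather than changing call sites.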
References
- SASK Project: JVM implementation reference
- autodev-workbench: JS implementation reference
- TreeSitter: https://tree-sitter.github.io/tree-sitter/
License
MIT License