github/semantic

Parsing, analyzing, and comparing source code across many languages

Top Related Projects

  • Semantic Kernel: Integrate cutting-edge LLM technology quickly and easily into your apps
  • AllenNLP (11,750 stars): An open-source NLP research library, built on PyTorch.
  • transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
  • spaCy (30,447 stars): 💫 Industrial-strength Natural Language Processing (NLP) in Python
  • fairseq (30,331 stars): Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
  • Stanza (7,314 stars): Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Quick Overview

GitHub Semantic is an open-source Haskell library and command line tool for parsing, analyzing, and comparing source code across multiple programming languages. It generates per-language syntax types from tree-sitter grammar definitions, and its symbol output is used for code navigation on github.com.

Pros

  • Parses multiple programming languages, including Ruby, JavaScript, TypeScript, Python, Go, and PHP
  • Produces symbol lists that are used for code navigation on github.com
  • Renders output in several formats: s-expression parse trees, JSON symbol lists, and protobuf symbol lists
  • Can be used both as a Haskell library and as a command line tool

Cons

  • Requires significant computational resources for large codebases
  • May have a steep learning curve for advanced usage
  • Documentation could be more comprehensive for some features
  • Limited community support compared to some other code analysis tools

Code Examples

  1. Parsing a TypeScript file into an s-expression parse tree (the default output):
semantic parse --sexpression path/to/file.ts
  2. Listing the symbols in a file as JSON:
semantic parse --json-symbols path/to/file.js
  3. Parsing several files without printing output, showing only timing stats:
semantic parse --quiet path/to/a.py path/to/b.py

Getting Started

To get started with GitHub Semantic, follow these steps:

  1. Clone the repository and build it (requires at least GHC 8.10.1 and Cabal 3.0):

    git clone git@github.com:github/semantic.git
    cd semantic
    script/bootstrap
    cabal v2-build all
    
  2. Run the command line tool to list the available options:

    cabal v2-run semantic:semantic -- --help
    
  3. Refer to the documentation for more advanced usage and configuration options.

Competitor Comparisons

Integrate cutting-edge LLM technology quickly and easily into your apps

Pros of Semantic Kernel

  • More active development with frequent updates and contributions
  • Broader scope, focusing on integrating AI capabilities into various applications
  • Extensive documentation and examples for easier adoption

Cons of Semantic Kernel

  • Steeper learning curve due to its comprehensive nature
  • Heavier dependency on external AI services, potentially increasing costs
  • Less focused on specific code analysis tasks compared to Semantic

Code Comparison

Semantic:

parseModule :: Parser Module
parseModule = do
  header <- optional moduleHeader
  imports <- many importDecl
  decls <- many topDecl
  return $ Module header imports decls

Semantic Kernel:

public class SemanticFunction
{
    public string Name { get; set; }
    public string Description { get; set; }
    public ISKFunction Function { get; set; }
    public List<ParameterView> Parameters { get; set; }
}

Summary

Semantic focuses on code analysis and parsing, while Semantic Kernel offers a broader toolkit for AI integration. Semantic may be more suitable for specific code-related tasks, whereas Semantic Kernel provides a more versatile platform for AI-powered applications. The choice between them depends on the project's requirements and the desired level of AI integration.

An open-source NLP research library, built on PyTorch.

Pros of AllenNLP

  • More comprehensive NLP toolkit with a wider range of pre-built models and tasks
  • Extensive documentation and tutorials, making it more accessible for beginners
  • Active community and regular updates

Cons of AllenNLP

  • Steeper learning curve due to its extensive feature set
  • Potentially slower performance for specific tasks compared to more specialized libraries

Code Comparison

AllenNLP:

from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz")
result = predictor.predict(sentence="The cat sat on the mat.")

Semantic:

semantic parse --json-symbols path/to/file.py

The two tools solve different problems. AllenNLP's code loads a specific pretrained model and runs it on an English sentence, whereas Semantic does not analyze natural language at all: its command line tool parses source files, here emitting a JSON symbol list. AllenNLP is better suited for researchers and developers building NLP models, while Semantic fits tooling that needs parse trees or symbols from source code.

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Pros of transformers

  • Extensive library of pre-trained models for various NLP tasks
  • Active community and frequent updates
  • Comprehensive documentation and tutorials

Cons of transformers

  • Can be resource-intensive for large models
  • Steeper learning curve for beginners
  • Not designed for parsing source code into syntax trees, which is semantic's focus

Code Comparison

transformers:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")[0]
print(f"Label: {result['label']}, Score: {result['score']:.4f}")

semantic (illustrative sketch; parseTreeFromString is not a function confirmed by the README below):

{-# LANGUAGE OverloadedStrings #-}
import Semantic

main :: IO ()
main = do
  let src = "function foo() { return 42; }"
  tree <- parseTreeFromString JavaScript src
  print tree

The code snippets demonstrate the different focus areas of the two libraries. transformers provides high-level APIs for various NLP tasks, while semantic is more focused on parsing and analyzing source code.

transformers is better suited for general NLP tasks and offers a wide range of pre-trained models. semantic, on the other hand, excels in semantic analysis of source code and is more specialized for programming language processing.

💫 Industrial-strength Natural Language Processing (NLP) in Python

Pros of spaCy

  • Extensive language support with pre-trained models for multiple languages
  • Comprehensive documentation and active community support
  • Efficient and fast processing for large-scale text analysis

Cons of spaCy

  • Targets natural-language text only; it does not parse source code, which is semantic's focus
  • Less focus on semantic parsing and logical form extraction
  • May require more manual configuration for specialized NLP tasks

Code Comparison

spaCy:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

for ent in doc.ents:
    print(ent.text, ent.label_)

semantic:

semantic parse --json-symbols path/to/file.py

Note: The code examples demonstrate basic usage for each tool. spaCy performs named entity recognition on English text in this example, while semantic operates on source files rather than natural language and, with --json-symbols, outputs a JSON symbol list.

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Pros of fairseq

  • Broader scope: Supports a wide range of sequence modeling tasks, including machine translation, text summarization, and language modeling
  • Active development: Regularly updated with new features and improvements
  • Extensive documentation: Comprehensive guides and examples for various use cases

Cons of fairseq

  • Steeper learning curve: Requires more in-depth knowledge of NLP concepts
  • Higher resource requirements: May need more computational power for training and inference

Code Comparison

fairseq:

from fairseq.models.transformer import TransformerModel
en2de = TransformerModel.from_pretrained(
    '/path/to/checkpoints',
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='data-bin/wmt16_en_de_bpe32k'
)
en2de.translate('Hello world!')

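semantic (illustrative sketch; runSemantic, defaultConfig, and parseFile are not confirmed by the README below):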

import Semantic.Api
import Semantic.Config

main :: IO ()
main = do
  config <- defaultConfig
  result <- runSemantic config $ do
    parseFile "path/to/file.py"
  print result

The code snippets demonstrate the different focus areas of the two projects. fairseq is geared towards NLP tasks, while semantic is designed for parsing and analyzing source code.

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Pros of Stanza

  • Supports a wide range of languages (over 60) for various NLP tasks
  • Provides pre-trained neural models for accurate linguistic annotations
  • Offers a Python interface with easy integration into existing workflows

Cons of Stanza

  • May have slower processing speed compared to Semantic
  • Requires more computational resources for running neural models
  • Limited focus on code analysis and programming language support

Code Comparison

Stanza example:

import stanza

nlp = stanza.Pipeline('en')
doc = nlp("Hello world!")
for sentence in doc.sentences:
    print([(word.text, word.upos) for word in sentence.words])

Semantic example (illustrative sketch; parseFile and its arguments are not confirmed by the README below):

{-# LANGUAGE OverloadedStrings #-}
import Semantic

main :: IO ()
main = do
  let src = "def hello(): print('Hello, world!')"
  ast <- parseFile Python src
  print ast

While Stanza focuses on natural language processing tasks, Semantic is tailored for parsing and analyzing source code across multiple programming languages. Stanza excels in linguistic annotations for human languages, whereas Semantic provides powerful tools for code analysis, making it more suitable for developers working with source code and programming languages.

README

Semantic

semantic is a Haskell library and command line tool for parsing, analyzing, and comparing source code.

In a hurry? Check out our documentation of example uses for the semantic command line tool.

Table of Contents
Usage
Language support
Development
Technology and architecture
Licensing

Usage

Run semantic --help for a complete list of up-to-date options.

Parse

Usage: semantic parse [--sexpression | (--json-symbols|--symbols) |
                        --proto-symbols | --show | --quiet] [FILES...]
  Generate parse trees for path(s)

Available options:
  --sexpression            Output s-expression parse trees (default)
  --json-symbols,--symbols Output JSON symbol list
  --proto-symbols          Output protobufs symbol list
  --show                   Output using the Show instance (debug only, format
                           subject to change without notice)
  --quiet                  Don't produce output, but show timing stats
  -h,--help                Show this help text
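
For scripting around the command line tool, the options above can be wrapped in a small helper. The following is a minimal sketch in Python, not part of semantic itself: the names `build_parse_command` and `run_parse` are hypothetical, and the only flags used are the ones listed in the help text above. It assumes a `semantic` binary is on the PATH and returns None when it is not.

```python
import shutil
import subprocess

# Output flags copied from the `semantic parse` help text above.
OUTPUT_FLAGS = {
    "sexpression": "--sexpression",
    "json-symbols": "--json-symbols",
    "proto-symbols": "--proto-symbols",
    "show": "--show",
    "quiet": "--quiet",
}


def build_parse_command(files, output="sexpression"):
    """Return the argv list for `semantic parse` with exactly one output flag.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    if output not in OUTPUT_FLAGS:
        raise ValueError(f"unknown output mode: {output!r}")
    return ["semantic", "parse", OUTPUT_FLAGS[output], *files]


def run_parse(files, output="sexpression"):
    """Run the command and return its stdout, or None if `semantic` is not on PATH."""
    if shutil.which("semantic") is None:
        return None
    completed = subprocess.run(
        build_parse_command(files, output),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout


if __name__ == "__main__":
    # Show the command that would be executed for two files.
    print(build_parse_command(["app.rb", "lib/util.py"], output="json-symbols"))
```

Keeping command construction separate from execution makes the argument handling easy to check without having semantic installed.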

Language support

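| Language   | Parse | AST Symbols† | Stack graphs |
|------------|-------|--------------|--------------|
| Ruby       | ✅    | ✅           |              |
| JavaScript | ✅    | ✅           |              |
| TypeScript | ✅    | ✅           | 🚧           |
| Python     | ✅    | ✅           | 🚧           |
| Go         | ✅    | ✅           |              |
| PHP        | ✅    | ✅           |              |
| Java       | 🚧    | ✅           |              |
| JSON       | ✅    | ⬜️           | ⬜️           |
| JSX        | ✅    | ✅           |              |
| TSX        | ✅    | ✅           |              |
| CodeQL     | ✅    | ✅           |              |
| Haskell    | 🚧    | 🚧           |              |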

† Used for code navigation on github.com.

  • ✅ — Supported
  • 🔶 — Partial support
  • 🚧 — Under development
  • ⬜ - N/A

Development

semantic requires at least GHC 8.10.1 and Cabal 3.0. We strongly recommend using ghcup to sandbox GHC versions, as GHC packages installed through your OS's package manager may not install statically-linked versions of the GHC boot libraries. semantic currently builds only on Unix systems; users of other operating systems may wish to use the Docker images.

We use cabal's Nix-style local builds for development. To get started quickly:

git clone git@github.com:github/semantic.git
cd semantic
script/bootstrap
cabal v2-build all
cabal v2-run semantic:test
cabal v2-run semantic:semantic -- --help

You can also use the Bazel build system for development. To learn more about Bazel and why it might give you a better development experience, check the build documentation.

git clone git@github.com:github/semantic.git
cd semantic
script/bootstrap-bazel
bazel build //...

stack as a build tool is not officially supported; there is unofficial stack.yaml support available, though we cannot make guarantees as to its stability.

Technology and architecture

Architecturally, semantic:

  1. Generates per-language Haskell syntax types based on tree-sitter grammar definitions.
  2. Reads blobs from a filesystem or provided via a protocol buffer request.
  3. Returns blobs or performs analysis.
  4. Renders output in one of many supported formats.
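
Steps 2 to 4 above (read a blob, analyze it, render the result in a chosen format) can be illustrated with a toy pipeline. This is a rough sketch in Python under loud assumptions: it is not semantic's implementation, it uses Python's built-in `ast` module as a stand-in for the tree-sitter-derived syntax types of step 1, and the names `Blob`, `read_blob`, `analyze`, and `render` are all hypothetical.

```python
import ast
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Blob:
    """A unit of source text plus the path it came from."""
    path: str
    source: str


def read_blob(path):
    """Step 2 (filesystem case only): load a file into a Blob."""
    return Blob(path=str(path), source=Path(path).read_text(encoding="utf-8"))


def analyze(blob):
    """Step 3, stand-in analysis: list top-level functions and classes.

    Uses Python's `ast` module, NOT tree-sitter or semantic.
    """
    symbols = []
    for node in ast.parse(blob.source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append({"name": node.name, "kind": "function", "line": node.lineno})
        elif isinstance(node, ast.ClassDef):
            symbols.append({"name": node.name, "kind": "class", "line": node.lineno})
    return symbols


def render(blob, symbols, fmt="json"):
    """Step 4: render the same analysis result in one of several formats."""
    if fmt == "json":
        return json.dumps({"path": blob.path, "symbols": symbols}, sort_keys=True)
    if fmt == "sexpression":
        parts = ["file", blob.path] + [f"({s['kind']} {s['name']})" for s in symbols]
        return "(" + " ".join(parts) + ")"
    raise ValueError(f"unknown format: {fmt!r}")


if __name__ == "__main__":
    blob = Blob("hello.py", "def hello():\n    print('hi')\n\nclass Greeter:\n    pass\n")
    print(render(blob, analyze(blob), fmt="sexpression"))
    # prints: (file hello.py (function hello) (class Greeter))
```

The point of the sketch is the separation of concerns: analysis produces a format-independent result, and rendering is a final, swappable step.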

Throughout its lifecycle, semantic has leveraged a number of interesting algorithms and techniques.

Contributions

Contributions are welcome! Please see our contribution guidelines and our code of conduct for details on how to participate in our community.

Licensing

Semantic is licensed under the MIT license.