Class GraphGenerationTask<T>

Namespace
AiDotNet.Data.Structures
Assembly
AiDotNet.dll

Represents a graph generation task where the goal is to generate new valid graphs.

public class GraphGenerationTask<T>

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
GraphGenerationTask<T>

Remarks

Graph generation creates new graph structures that follow learned patterns from training data. This is useful for generating novel molecules, designing new materials, creating synthetic networks, and other generative tasks.

For Beginners: Graph generation means creating new graphs that look like realistic members of the training data, such as plausible molecules or networks.

Real-world examples:

Drug Discovery:

  • Task: Generate novel drug-like molecules
  • Input: Training set of known drugs
  • Output: New molecular structures with desired properties
  • Goal: Discover new drug candidates automatically
  • Example: Generate molecules that bind to a specific protein target

Material Design:

  • Task: Generate new material structures
  • Input: Database of materials with known properties
  • Output: Novel material configurations
  • Goal: Design materials with specific properties (strength, conductivity, etc.)

Synthetic Data Generation:

  • Task: Create realistic social network graphs
  • Input: Real social network data
  • Output: Synthetic networks preserving statistical properties
  • Goal: Generate data for testing while preserving privacy

Molecular Optimization:

  • Task: Modify molecules to improve properties
  • Input: Starting molecule
  • Output: Similar molecules with better properties
  • Example: Improve drug efficacy while maintaining safety

Approaches:

  • Autoregressive: Generate nodes/edges one at a time
  • VAE: Learn latent space of graphs, sample new ones
  • GAN: Generator creates graphs, discriminator evaluates them
  • Flow-based: Learn invertible transformations of graph distributions
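The autoregressive approach can be sketched as a loop that adds one node at a time and decides, edge by edge, whether to connect it to earlier nodes. This is a minimal illustration of the control flow only: the constant probabilities stand in for a trained model, and the variable names (maxNumNodes, maxNumEdges, rng) are assumptions for this example, not AiDotNet APIs.

```csharp
using System;
using System.Collections.Generic;

// Illustrative limits, playing the role of MaxNumNodes and MaxNumEdges.
int maxNumNodes = 6;
int maxNumEdges = 8;
var rng = new Random(42);           // fixed seed for repeatable runs
var edges = new List<(int A, int B)>();
int numNodes = 0;

// Add one node per step; for each new node, decide edge by edge whether to
// connect it to each earlier node. A trained model would supply these
// probabilities; constants stand in for it here.
while (numNodes < maxNumNodes)
{
    int newNode = numNodes++;
    for (int earlier = 0; earlier < newNode && edges.Count < maxNumEdges; earlier++)
    {
        if (rng.NextDouble() < 0.5)
            edges.Add((earlier, newNode));
    }

    // Stop early with a fixed probability, standing in for a learned stop signal.
    if (rng.NextDouble() < 0.1) break;
}

Console.WriteLine($"nodes: {numNodes}, edges: {edges.Count}");
```

Because every edge joins a new node to a strictly earlier one, the result can never contain self-loops or duplicate edges, and both size limits hold by construction.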

Properties

EdgeTypes

Possible edge types/labels (for categorical edge features).

public List<string> EdgeTypes { get; set; }

Property Value

List<string>

Remarks

In molecule generation, this could be bond types: single, double, triple, aromatic.

GenerationBatchSize

Number of graphs to generate per batch during training.

public int GenerationBatchSize { get; set; }

Property Value

int

GenerationMetrics

Metrics to track during generation (e.g., validity rate, uniqueness, novelty).

public Dictionary<string, double> GenerationMetrics { get; set; }

Property Value

Dictionary<string, double>

Remarks

Common metrics for graph generation:

  • Validity: Percentage of generated graphs that satisfy constraints
  • Uniqueness: Percentage of unique graphs (not duplicates)
  • Novelty: Percentage not in training set (not memorized)
  • Property matching: Do generated graphs have desired properties?
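As a rough sketch, the first three metrics could be computed as follows and stored in a dictionary shaped like GenerationMetrics. The example assumes each graph has already been reduced to a canonical string key (a hypothetical step; for molecules this might be a canonical SMILES string) and that an empty key marks an invalid graph. It also follows one common convention, measuring uniqueness among valid graphs and novelty among unique valid graphs; other definitions exist, and none of the names below are AiDotNet APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins: each graph is represented by a canonical string key,
// and an empty key marks a graph that failed the validity check.
var trainingKeys = new HashSet<string> { "CCO", "CCN", "C=O" };
var generatedKeys = new List<string> { "CCO", "CCC", "CCC", "", "C#N" };

// Validity: fraction of generated graphs that pass the check.
var valid = generatedKeys.Where(k => k.Length > 0).ToList();
double validity = (double)valid.Count / generatedKeys.Count;

// Uniqueness: fraction of valid graphs that are distinct.
var unique = valid.Distinct().ToList();
double uniqueness = valid.Count == 0 ? 0.0 : (double)unique.Count / valid.Count;

// Novelty: fraction of unique valid graphs absent from the training set.
int novelCount = unique.Count(k => !trainingKeys.Contains(k));
double novelty = unique.Count == 0 ? 0.0 : (double)novelCount / unique.Count;

var metrics = new Dictionary<string, double>
{
    ["validity"] = validity,
    ["uniqueness"] = uniqueness,
    ["novelty"] = novelty,
};

foreach (var (name, value) in metrics)
    Console.WriteLine($"{name}: {value:F2}");
```

Here 4 of the 5 generated graphs are valid, 3 of those 4 are distinct, and 2 of those 3 do not appear in the training set.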

IsDirected

Whether to generate directed graphs.

public bool IsDirected { get; set; }

Property Value

bool

MaxNumEdges

Maximum number of edges allowed in generated graphs.

public int MaxNumEdges { get; set; }

Property Value

int

MaxNumNodes

Maximum number of nodes allowed in generated graphs.

public int MaxNumNodes { get; set; }

Property Value

int

Remarks

This constraint helps control computational cost and memory usage during generation.

NodeTypes

Possible node types/labels (for categorical node features).

public List<string> NodeTypes { get; set; }

Property Value

List<string>

Remarks

In molecule generation, this could be atom types: C, N, O, F, etc.

For Beginners: When generating molecules:

  • NodeTypes might be: ["C", "N", "O", "F", "S", "Cl"]
  • Each generated node must be one of these atom types
  • This ensures generated molecules use valid atoms
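One common way to work with such a label list is one-hot encoding: each label maps to a vector with a single 1, and model scores are decoded by picking the highest-scoring position. Whether AiDotNet encodes NodeTypes this way is not stated here; OneHot and Decode below are hypothetical helpers written for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative label set, mirroring the NodeTypes example above.
var nodeTypes = new List<string> { "C", "N", "O", "F", "S", "Cl" };

// Encode a label as a one-hot vector whose length equals nodeTypes.Count.
double[] OneHot(string label)
{
    int index = nodeTypes.IndexOf(label);
    if (index < 0)
        throw new ArgumentException($"Unknown node type: {label}");
    var vector = new double[nodeTypes.Count];
    vector[index] = 1.0;
    return vector;
}

// Decode per-type scores back to a label by taking the highest-scoring index.
// Assumes scores.Length == nodeTypes.Count, so the result is always an allowed type.
string Decode(double[] scores)
{
    int best = 0;
    for (int i = 1; i < scores.Length; i++)
        if (scores[i] > scores[best]) best = i;
    return nodeTypes[best];
}

Console.WriteLine(string.Join(", ", OneHot("O")));
Console.WriteLine(Decode(new[] { 0.1, 0.7, 0.05, 0.05, 0.05, 0.05 }));
```

Decoding through the label list is what keeps generated nodes inside the allowed set: the model can only ever select an index that corresponds to a listed type.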

NumEdgeFeatures

Number of edge feature dimensions (0 if no edge features).

public int NumEdgeFeatures { get; set; }

Property Value

int

NumNodeFeatures

Number of node feature dimensions.

public int NumNodeFeatures { get; set; }

Property Value

int

TrainingGraphs

Training graphs used to learn the distribution.

public List<GraphData<T>> TrainingGraphs { get; set; }

Property Value

List<GraphData<T>>

Remarks

The generative model learns patterns from these graphs and generates similar ones.

ValidationGraphs

Validation graphs for monitoring generation quality.

public List<GraphData<T>> ValidationGraphs { get; set; }

Property Value

List<GraphData<T>>

ValidityChecker

Validity constraints for generated graphs.

public Func<GraphData<T>, bool>? ValidityChecker { get; set; }

Property Value

Func<GraphData<T>, bool>

Remarks

Custom validation function to check if a generated graph is valid. For molecules, this might check chemical valency rules.

For Beginners: Generated graphs must be valid/realistic:

Molecular constraints:

  • Carbon can form at most 4 bonds
  • Oxygen typically has 2 bonds
  • No impossible bond types
  • Valid ring structures

Social network constraints:

  • No self-loops (people can't be friends with themselves)
  • Degree distribution matches real networks
  • Community structure makes sense

Validity constraints help ensure generated graphs are meaningful.
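The molecular constraints above can be sketched as a predicate of the same general shape as ValidityChecker. Because the members of GraphData<T> are not described on this page, the example uses a hypothetical stand-in (a tuple of atom labels and bond triples) and a simplified valence table with implicit hydrogens; it illustrates the idea and is not the library's own checker.

```csharp
using System;
using System.Collections.Generic;

// Maximum total bond order per atom type (simplified valence rules).
var maxValence = new Dictionary<string, int>
{
    ["C"] = 4, ["N"] = 3, ["O"] = 2, ["F"] = 1,
};

// Hypothetical stand-in for a molecular graph: atom labels plus bonds given as
// (A, B, Order) index triples. The real GraphData<T> layout may differ.
Func<(string[] Atoms, (int A, int B, int Order)[] Bonds), bool> isValid = mol =>
{
    var used = new int[mol.Atoms.Length];
    foreach (var (a, b, order) in mol.Bonds)
    {
        if (a < 0 || b < 0 || a >= used.Length || b >= used.Length)
            return false;                    // bond endpoints must exist
        if (a == b || order < 1)
            return false;                    // no self-loops, no empty bonds
        used[a] += order;
        used[b] += order;
    }
    for (int i = 0; i < used.Length; i++)
    {
        // Reject unknown atom types and atoms that exceed their valence.
        if (!maxValence.TryGetValue(mol.Atoms[i], out int limit) || used[i] > limit)
            return false;
    }
    return true;
};

// A C=O double bond passes; a triple bond to oxygen exceeds oxygen's valence.
var carbonyl = (new[] { "C", "O" }, new[] { (0, 1, 2) });
var overbonded = (new[] { "C", "O" }, new[] { (0, 1, 3) });
Console.WriteLine(isValid(carbonyl));
Console.WriteLine(isValid(overbonded));
```

With the real types, a function like this would presumably be assigned to ValidityChecker as a lambda taking a GraphData<T>; since the property is nullable, leaving it unset is also permitted by the signature.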