Configuration

Overview

At its most basic, the configuration class defines which tokens your language uses. When you create the class and extend LARFConfig, you must implement several methods:

public class MyLanguageConfig extends LARFConfig {

    public MyLanguageConfig() {
        super("My Language", 0.5);
    }

    @Override
    protected void initFunctions() { }

    @Override
    protected void initTokenHandlers() { }

    @Override
    public Optional<TokenModifier> getDefaultModifier() {
        return Optional.empty();
    }

    @Override
    protected TypeOperation initTypeOperations() { return null; }

    @Override
    protected void initOperators() { }

    @Override
    protected void initParserFormatters() { }

    @Override
    protected void initErrorHandlers() { }
}

First, the call to the superclass constructor sets the name and version of the language. When code is loaded, these values are checked: files written for the same or an earlier version are accepted (backwards compatibility), but an error is thrown if the language name does not match or if the version recorded in the file is greater than the version running. This prevents problems caused by missing tokens. Most languages implement a similar check. In Java, a version mismatch looks like the following:

java.lang.UnsupportedClassVersionError: Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(Unknown Source)
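
The name and version check described above could be sketched roughly as follows. This is a minimal, standalone illustration and not LARF's actual implementation: the class VersionCheckSketch, the method checkCompatible and the exception IncompatibleLanguageException are all hypothetical names introduced for this example.

```java
// Hypothetical sketch: none of these names are part of the LARF API.
public class VersionCheckSketch {

    /** Thrown when a source file cannot be run by the current language build. */
    static class IncompatibleLanguageException extends RuntimeException {
        IncompatibleLanguageException(String message) {
            super(message);
        }
    }

    /**
     * Accepts a file only if it declares the same language name and a
     * version that is not newer than the version currently running.
     */
    static void checkCompatible(String runningName, double runningVersion,
                                String fileName, double fileVersion) {
        if (!runningName.equals(fileName)) {
            throw new IncompatibleLanguageException(
                "Expected language '" + runningName + "' but file declares '" + fileName + "'");
        }
        if (fileVersion > runningVersion) {
            throw new IncompatibleLanguageException(
                "File requires version " + fileVersion
                    + " but the running version is " + runningVersion);
        }
    }

    public static void main(String[] args) {
        // A file from an older version is accepted (backwards compatible).
        checkCompatible("My Language", 0.5, "My Language", 0.4);

        // A file from a newer version is rejected.
        try {
            checkCompatible("My Language", 0.5, "My Language", 0.6);
        } catch (IncompatibleLanguageException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Rejecting newer files up front means a program fails at load time with a clear message rather than part-way through parsing, when an unknown token is encountered.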

The constructor can also be used to set properties which affect how your language operates. The other methods are used for the following:

initFunctions: Used to define any system functions that expose functionality from the underlying language (Java). For an example, see the System Functions section.
initTokenHandlers: Used to define all literals and statement tokens. Your tokens may require an argument for the value when creating a new instance, e.g. addTokenHandler(new MyToken(null)). Passing null is fine here because the registered instance is only used for pattern matching and for creating new instances once a match is made. See Token Class.
getDefaultModifier: If a variable or object is defined without a modifier, this method determines which modifier is used to enforce encapsulation rules. If none is specified (Optional.empty), access is restricted to the object in which the variable or object resides, i.e. private.
initTypeOperations: Used to define type operations that handle interactions between tokens and operators. The method must return a handler, which is used as the default handler for basic operations. For example, you might define a simple equals / not equals handler for any token values. See Type Operations for more details.
initOperators: Defines operators using either one of the inbuilt templates or a custom set (see Operators).
initParserFormatters: Data structures like collections and maps may store their contents using tokens depending on their implementation. Formatters are used to map their token structure used in the Parser back to their Java equivalents if required. Please see Formatters.
initErrorHandlers: Allows handlers to be defined for in-language errors. This could be as simple as checked / unchecked or handling thrown Java exceptions for null or Arithmetic events. See Error Handling for more details.

Common Properties

Properties can change how your language operates and behaves. Properties can be set either on the configuration object itself e.g. myLanguageConfig.setProperty(DefaultProperty.DEBUG_MODE, true); or within the constructor:

public class MyLanguageConfig extends LARFConfig {

    public MyLanguageConfig() {
        super("My Language", 0.5);
        setProperty(DefaultProperty.DEBUG_MODE, true);
    }

    //...
}

Common properties can be found in the DefaultProperty enum class and are listed below:

Value	Type	Description
DATE_FORMAT	String	The default date pattern used by the in-built DATE function, e.g. `setProperty(DefaultProperty.DATE_FORMAT, "dd-MM-yyyy")` would support specifying `DATE(23-05-1999)`. A second argument can be passed to the DATE function to supply a custom pattern.
DEBUG_MODE	Boolean	This enables in-depth logging of all parser and token operations, e.g. `setProperty(DefaultProperty.DEBUG_MODE, true)`. This is useful when debugging your own tokens / statements.
SAFE_OPERATIONS	Boolean	This can be used as a flag in your language to enable / disable certain features. For example, if you defined a Token that used reflection to import Java objects into your language, you may want to restrict method invocations to avoid unwanted code being run (code injection attacks if running on a server).
STRICT_SYNTAX	Boolean	It is strongly recommended to keep this setting enabled along with using the Reference Token. Without it, the lexer will accept unknown values, which could cause unexpected results.
NOTATION_TYPE	ExpressionNotationType	This determines the order in which operators and values are evaluated. There are three types which are PREFIX, INFIX and POSTFIX.
CODE_BLOCK_STYLE	CodeBlockStyle	Language code blocks are typically split between those which start and end with a character or phrase, or alternatively use indentation (whitespace). LARF supports three options for code-blocks which are DELIMITER, WHITESPACE_FIXED, WHITESPACE_IDENTIFY. Please see here for more details.
WHITESPACE_VALUE	String	When using the CODE_BLOCK_STYLE property with the WHITESPACE_FIXED option, this value determines the fixed value used to represent each level of code-block indentation. For example, you may choose to specify four spaces or a tab. Alternatively, you can specify several alternatives separated by a pipe (i.e. "...|..."). Tabs can be represented by using `\t`.
STRICT_WHITESPACE	Boolean	When using WHITESPACE_FIXED or WHITESPACE_IDENTIFY, if a line is defined which doesn't follow either the pattern determined by WHITESPACE_VALUE or the pattern identified using WHITESPACE_IDENTIFY then an error is thrown.
LANGUAGE_TYPED	LanguageTyped	Languages can either be typed or typeless. Typed languages require a type to be assigned to values within context. If a value is assigned which is not compatible then errors will be thrown.
FLAG_NATIVE_ERRORS	Boolean	When an error is thrown from the underlying language and mapped to an in-language error (See Error Checking) then by default a JVM stack trace will be included. If this is disabled then only the language stack trace will be provided.
JVM_TRACE_LIMIT	Integer	If FLAG_NATIVE_ERRORS is enabled, this option determines the number of lines of the stack trace to include. This is useful as a JVM error and its accompanying stack trace can be quite extensive.
JUMP_SUPPORT	Boolean	Adds support for jumps within a language. This provides the ability for a program to jump directly to a defined label (See Jumping for more information).
FORWARD_JUMPING	Boolean	By default, jumping enabled through JUMP_SUPPORT only supports backward jumps. If this option is enabled then forward jumps are also permitted.
GLOBAL_SCOPE	Boolean	Values stored to context obey scope by default. As such, code which does not share the same scope cannot access those values directly. This suits most languages, but for simplicity you may want to treat all values as accessible. This option disables variable scoping and makes all values accessible irrespective of where they are declared.
CASE_SENSITIVE	Boolean	When writing a language, you may want to allow people to use any case. For example, if you were defining a new language you could allow people to use `copy R1 to R2` or `COPY R1 TO R2`.
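
To illustrate how a WHITESPACE_VALUE such as four spaces or a tab might translate into nesting depth, here is a minimal standalone sketch. It assumes the pipe-separated value is interpreted as a regular-expression alternation, which is an assumption and not confirmed LARF behaviour. The class IndentDepthSketch and the method indentDepth are hypothetical names, and returning -1 stands in for the error that STRICT_WHITESPACE describes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: not part of the LARF API.
public class IndentDepthSketch {

    /**
     * Counts how many indentation units open a line. The unit is a
     * pipe-separated list of alternatives, e.g. "    |\t" for four
     * spaces or a tab (assumed here to be treated as a regex).
     * Returns -1 if leading whitespace remains that fits no unit.
     */
    static int indentDepth(String line, String unit) {
        Matcher matcher = Pattern.compile(unit).matcher(line);
        int position = 0;
        int depth = 0;
        while (position < line.length()) {
            // Only accept a unit that starts exactly at the current position.
            matcher.region(position, line.length());
            if (!matcher.lookingAt() || matcher.end() == position) {
                break;
            }
            position = matcher.end();
            depth++;
        }
        // Leftover whitespace means the line does not follow the pattern.
        if (position < line.length() && Character.isWhitespace(line.charAt(position))) {
            return -1;
        }
        return depth;
    }

    public static void main(String[] args) {
        String unit = "    |\t";
        System.out.println(indentDepth("print x", unit));
        System.out.println(indentDepth("        print x", unit));
        System.out.println(indentDepth("\t\tprint x", unit));
        System.out.println(indentDepth("   print x", unit));
    }
}
```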

Custom Properties

You can define custom properties using an overloaded version of the setProperty method. For example, to set a property that enables some of the more experimental features of your language, you could use myLanguageConfig.setProperty("experimentalFeatures", true); You can then read it from any Token process(...) method using boolean enableFeatures = config.getProperty("experimentalFeatures", Boolean.class);
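
The string-keyed, type-checked lookup described above can be pictured with a small standalone sketch. This is not LARF's implementation: the class PropertyStoreSketch is hypothetical and simply mirrors the setProperty / getProperty shape shown in the text. The behaviour for a missing key (returning null) and the getPropertyOrDefault helper are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not part of the LARF API.
public class PropertyStoreSketch {

    private final Map<String, Object> properties = new HashMap<>();

    /** Stores a custom property under a string key. */
    void setProperty(String key, Object value) {
        properties.put(key, value);
    }

    /**
     * Reads a property and casts it to the requested type.
     * Returns null if the key is absent (an assumption of this sketch);
     * throws ClassCastException if the stored value has another type.
     */
    <T> T getProperty(String key, Class<T> type) {
        Object value = properties.get(key);
        return value == null ? null : type.cast(value);
    }

    /** Hypothetical helper that avoids null when a property was never set. */
    <T> T getPropertyOrDefault(String key, Class<T> type, T fallback) {
        T value = getProperty(key, type);
        return value == null ? fallback : value;
    }

    public static void main(String[] args) {
        PropertyStoreSketch config = new PropertyStoreSketch();
        config.setProperty("experimentalFeatures", true);

        boolean enableFeatures = config.getProperty("experimentalFeatures", Boolean.class);
        boolean verbose = config.getPropertyOrDefault("verbose", Boolean.class, false);
        System.out.println(enableFeatures + " " + verbose);
    }
}
```

In a store that returns null for missing keys, as this sketch does, assigning the result straight to a primitive boolean would throw a NullPointerException on unboxing, so a fallback value or an explicit null check is worth considering when a property is optional.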
