A library for working with Table Schema in Julia:
Table Schema is a simple language- and implementation-agnostic way to declare a schema for tabular data. Table Schema is well suited for use cases around handling and validating tabular data in text formats such as CSV, but its utility extends well beyond this core usage, towards a range of applications where data benefits from a portable schema format.
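To make the idea of a portable schema concrete, here is a minimal sketch of what a Table Schema descriptor might look like for a small cities table. It is written as plain Julia dictionaries and does not use this package. The field names and the "N/A" missing-value marker are assumptions chosen for illustration; the `fields`, `name`, `type` and `missingValues` keys come from the Table Schema specification.

```julia
# Hypothetical descriptor for a two-column cities table (illustration only).
descriptor = Dict(
    "fields" => [
        Dict("name" => "city", "type" => "string"),
        Dict("name" => "location", "type" => "geopoint"),
    ],
    # Strings that should be read as missing values rather than data.
    "missingValues" => ["", "N/A"],
)

# Collect the declared column names in order.
field_names = [f["name"] for f in descriptor["fields"]]
```

The same structure is normally stored as JSON, which is what makes the schema portable across languages and tools.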
- Table: class for working with data and schema
- Schema: class for working with schemata
- Field: class for working with schema fields
- validate: function for validating schema descriptors
- infer: function that creates a schema based on a data sample
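As a rough illustration of what inferring a schema from a data sample involves, the sketch below guesses a type for each column from string values. It is not this package's implementation: `guess_type` and `sketch_infer` are hypothetical helper names, and only three Table Schema types (integer, number, string) are considered.

```julia
# Guess a Table Schema type name for one column of string values (sketch only).
function guess_type(values)
    all(v -> tryparse(Int, v) !== nothing, values) && return "integer"
    all(v -> tryparse(Float64, v) !== nothing, values) && return "number"
    return "string"
end

# Build a descriptor-like Dict from headers and sample rows (hypothetical helper).
function sketch_infer(headers, rows)
    fields = [Dict("name" => h, "type" => guess_type([r[i] for r in rows]))
              for (i, h) in enumerate(headers)]
    return Dict("fields" => fields)
end

sample_headers = ["id", "height", "name"]
sample_rows = [["1", "10.0", "string1"], ["2", "10.1", "string2"]]
inferred = sketch_infer(sample_headers, sample_rows)
inferred_types = [f["type"] for f in inferred["fields"]]
```

A real inference routine would also need to handle dates, booleans and missing values; this sketch only shows the basic idea of testing candidate types against every sampled value.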
🚧 This package is pre-release and under heavy development. Please see DESIGN.md for a detailed overview of our goals, and visit the issues page to contribute and make suggestions. For questions that need a real-time response, reach out via Gitter. Thanks! 🚧
We aim to make this library compatible with all widely used approaches to work with tabular data in Julia.
Please visit our wiki for a list of related projects that we are tracking, and contribute use cases there or as enhancement issues.
See the examples folder and the unit tests in runtests.jl for current usage.
using TableSchema
table = Table("cities.csv")
table.headers
# ['city', 'location']
table.read(keyed=true)
# [
# {city: 'london', location: '51.50,-0.11'},
# {city: 'paris', location: '48.85,2.30'},
# {city: 'rome', location: 'N/A'},
# ]
rows = table.source
# 6×5 Array{Any,2}:
# "id" "height" "age" "name" "occupation"
# 1 10.0 1 "string1" "2012-06-15 00:00:00"
# 2 10.1 2 "string2" "2013-06-15 01:00:00"
# ...
err = table.errors # handle errors
...
schema = Schema("schema.json")
schema.fields
# <Field1, Field2...>
err = schema.errors # handle errors
Add fields to create or expand your schema like this:
schema = Schema()
field = Field()
field.descriptor._name = "A column"
field.descriptor.typed = "Integer"
add_field(schema, field)
🚧 Work In Progress. The following documentation is relevant only after package release. In the interim, please see DataPackage.jl.
The package uses semantic versioning, meaning that major versions could include breaking changes. It is highly recommended to specify a version range in your REQUIRE file, e.g.:
v"0.0.1-" <= TableSchema < v"1.0.0-"
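The trailing hyphen in these bounds matters: in Julia, a version literal such as v"1.0.0-" sorts below every 1.0.0 pre-release. The following sketch uses only Base Julia's VersionNumber comparison to show which versions such a range would admit; `in_range` is a hypothetical helper for illustration and is not part of the package manager.

```julia
# Bounds taken from the REQUIRE example above.
const lower_bound = v"0.0.1-"
const upper_bound = v"1.0.0-"

# Hypothetical helper: is version v inside the half-open range?
in_range(v::VersionNumber) = lower_bound <= v < upper_bound

checks = Dict(
    "0.0.1" => in_range(v"0.0.1"),          # first release is included
    "0.1.0" => in_range(v"0.1.0"),          # later 0.x release is included
    "1.0.0-rc1" => in_range(v"1.0.0-rc1"),  # 1.0.0 pre-releases are excluded
    "1.0.0" => in_range(v"1.0.0"),          # next major version is excluded
)
```

With an upper bound of v"1.0.0" instead, pre-releases of the next major version would fall inside the range, which is usually not what is wanted.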
At the Julia REPL, install the package with:
(v1.0) pkg> add "https://github.com/loleg/TableSchema.jl"
Code examples here require Julia 0.7, as we are now migrating to Julia 1.0. See Pkg documentation for further information.
Clone this repository, then enter the Pkg REPL (press ] at the Julia prompt) to activate and test it:
cd <path-to-my-folder>/TableSchema.jl
julia
# Press ]
(v1.0) pkg> activate .
(TableSchema) pkg> test
You can also install the package locally and run unit tests from the console:
(v1.0) pkg> add .
julia test/runtests.jl
A new feature of Julia's package manager is the dev command. To get a copy of this package installed into your ~/.julia folder and updated with every change, use:
(v1.0) pkg> dev TableSchema