Apache Druid querying library.
pkg> add Druid
Druid native queries documentation
using Druid
client = Client("http://localhost:8888")
timeseries_query = Timeseries(
    dataSource=Table("wikipedia"),
    intervals=[Interval("2015-09-12", "2015-09-13")],
    granularity=SimpleGranularity("hour"),
    aggregations=[Count("total_rows"), SingleField("longSum", "added", "documents_added")]
)
println(execute(client, timeseries_query))
Druid SQL documentation
using Druid
client = Client("http://localhost:8888")
sql_query = Sql(query="""
SELECT FLOOR(__time TO HOUR) AS "timestamp", COUNT(*) AS "total_rows", SUM("added") AS "documents_added"
FROM wikipedia
WHERE __time >= TIMESTAMP '2015-09-12' AND __time < TIMESTAMP '2015-09-13'
GROUP BY FLOOR(__time TO HOUR)
ORDER BY "timestamp" ASC
""")
println(execute(client, sql_query))
Most queries return the query response as an object compatible with the Tables.jl interface, so the result is easy to convert into another compatible type, such as a DataFrame.
using DataFrames
# `query` is any of the compatible queries listed below
result = execute(client, query)
df = DataFrame(result)
Compatible queries: Timeseries, TopN, GroupBy, Scan, Search, Sql.
A Sql query returns the result as either a Druid.SqlResult{ResultFormat} or a CSV.File, depending on the resultFormat provided in the SQL query. Both are compatible with the Tables.jl interface.
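Because these results implement the Tables.jl interface, they can presumably also be consumed through the generic Tables.jl accessors without converting to a DataFrame first. The following is a minimal sketch that assumes only the Tables.jl package. The vector of NamedTuples is a stand-in with made-up values for what execute(client, query) would return, not actual Druid output; its column names simply mirror the examples above.

```julia
using Tables

# Stand-in for `execute(client, query)` (assumption: made-up values).
# A vector of NamedTuples is itself a Tables.jl-compatible row table,
# so the generic calls below apply to it unchanged.
result = [
    (timestamp = "2015-09-12T00:00:00.000Z", total_rows = 10, documents_added = 250),
    (timestamp = "2015-09-12T01:00:00.000Z", total_rows = 20, documents_added = 500),
]

# Iterate row by row and read fields by name.
for row in Tables.rows(result)
    println(Tables.getcolumn(row, :timestamp), " => ", Tables.getcolumn(row, :total_rows))
end

# Materialize the table as a NamedTuple of column vectors.
cols = Tables.columntable(result)
total = sum(cols.documents_added)
```

Writing against Tables.rows and Tables.columntable rather than a concrete result type keeps the same code usable for any of the compatible queries.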
TimeBoundary, SegmentMetadata and DatasourceMetadata return their results as Dicts.