We often want to count things, and one way to do that is to create a
dictionary that maps objects to their counts. A Counter object simplifies
that process. Say we want to count values of type String. We would
create a counter for that type like this:
julia> c = Counter{String}()
Counter{String} with 0 entries
The two primary operations for a Counter are value increment and
value retrieval. To increment the value of a counter we do this:
julia> c["hello"] += 1
1
To access the count, we use square brackets:
julia> c["hello"]
1
julia> c["bye"]
0
Notice that we need not worry about whether or not a key is
already known to the Counter. If presented with an unknown key,
the Counter assumes its value is 0.
A Counter may be assigned to like this: c["alpha"]=4, but
the more likely use case is c["bravo"]+=1, invoked each
time a value, such as "bravo", is encountered.
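As a minimal sketch of that use case, the loop below tallies a small word list. The word list and variable names are made up for illustration; only Counter and the bracket syntax described above are used.

```julia
using Counters

# Hypothetical input: a short list of words to tally.
words = ["bravo", "alpha", "bravo", "charlie", "bravo"]

c = Counter{String}()
for w in words
    c[w] += 1   # unknown keys are treated as 0, so no existence check is needed
end

c["bravo"]      # 3
c["delta"]      # 0, since "delta" was never encountered
```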
The function counter (lowercase 'c') counts the elements of a list/array
or set. The multiplicity of an element is the number of times it
appears in the list.
julia> A = [ "alpha", "bravo", "alpha", "gamma" ];
julia> C = counter(A);
julia> C
Counter{String} with these nonzero values:
alpha ==> 2
bravo ==> 1
gamma ==> 1
If c and d are counters (of the same type of object) their sum
c+d creates a new counter by adding the values in c and d. That
is, if a=c+d and k is any key, then a[k] equals c[k]+d[k].
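A short sketch of counter addition, using made-up inputs, might look like this. It relies only on counter and + as described above; a key missing from one operand contributes 0.

```julia
using Counters

# Hypothetical inputs chosen so that the keys only partly overlap.
c = counter(["alpha", "bravo", "alpha"])
d = counter(["bravo", "gamma"])

a = c + d

a["alpha"]   # 2 = c["alpha"] + d["alpha"] = 2 + 0
a["bravo"]   # 2 = 1 + 1
a["gamma"]   # 1 = 0 + 1
```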
To increment the count of an item x in a counter c we may either
use c[x]+=1 or the increment function like this: incr!(c,x).
The increment function incr! is more useful for incrementing a
collection of items. Use incr!(c,items) to add 1 to the count
for each element held in items. If an element is present in items
multiple times, its count is incremented for each occurrence.
julia> c = Counter{Int}()
Counter{Int64} with 0 entries
julia> items = [1,2,3,4,1,2,1]
7-element Array{Int64,1}:
1
2
3
4
1
2
1
julia> incr!(c,items)
julia> c
Counter{Int64} with these nonzero values:
1 ==> 3
2 ==> 2
3 ==> 1
4 ==> 1
In addition, incr! may be used to increment one counter
by the amount held in another. Note that it's the first argument c
that gets changed; there is no effect on the second argument d.
Note: incr!(c,d) and c += d have the same effect, but the first
is more efficient.
julia> c = Counter{Int}()
Counter{Int64} with these nonzero values:
julia> items = [1,2,3,4,1,2,1]
7-element Vector{Int64}:
1
2
3
4
1
2
1
julia> incr!(c,items)
julia> c
Counter{Int64} with these nonzero values:
1 ==> 3
2 ==> 2
3 ==> 1
4 ==> 1
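The counter-to-counter form of incr! can be sketched as follows. The inputs are invented for illustration, and the commented values follow from the description of incr!(c,d) given earlier.

```julia
using Counters

# Hypothetical counters with one key in common.
c = counter([1, 1, 2])
d = counter([2, 3])

incr!(c, d)        # adds d's counts into c in place

c[1], c[2], c[3]   # (2, 2, 1)
d[2], d[3]         # (1, 1): d is unchanged
```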
- sum(c) returns the sum of the values in c; that is, the total of all the counts.
- length(c) returns the number of values held in c. Note that this might include objects with value 0.
- nnz(c) returns the number of nonzero values held in c.
- keys(c) returns an iterator for the keys held by c.
- values(c) returns an iterator for the values held by c.
- display(c) gives a printout of all the keys and their nonzero values in c.
- clean!(c) removes all keys from c whose value is 0. This won't change its behavior, but will free up some memory.
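The following sketch exercises a few of these functions on a made-up counter. The value of length(c) before clean! is deliberately not annotated, since whether a zero-valued key is counted depends on what the counter currently stores.

```julia
using Counters

c = counter(["x", "x", "y"])
c["z"] = 0     # explicitly assign a zero count

sum(c)         # 3: the total of all counts
length(c)      # may include the zero-valued key "z"

clean!(c)      # remove keys whose count is 0
length(c)      # 2: only "x" and "y" remain
c["z"]         # still 0, so lookups behave as before
```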
We can convert a Counter into a one-dimensional
array in which each element appears with its appropriate multiplicity
using collect:
julia> C = Counter{Int}()
Counter{Int64} with 0 entries
julia> C[3] = 4
4
julia> C[5] = 0
0
julia> C[-2] = 2
2
julia> collect(C)
6-element Array{Int64,1}:
3
3
3
3
-2
-2
The function collect_by_counts lists the elements of a Counter once each,
but in decreasing order of their counts. That is, the element with the highest count
is first, the element with the second highest count is second, and so forth.
Elements whose count is zero are not listed.
julia> collect_by_counts(C)
2-element Vector{Int64}:
3
-2
If the objects counted in C are numbers, then we compute the
average of those numbers, weighted by their counts, with mean(C).
julia> C = Counter{Int}()
Counter{Int64} with 0 entries
julia> C[2] = 3
3
julia> C[3] = 7
7
julia> mean(C)
2.7
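To see where 2.7 comes from, the arithmetic can be reproduced in plain Julia. This is only a sketch of the weighted-average formula, with a Dict standing in for the Counter; it is not the package's implementation.

```julia
# A Dict stands in for the Counter: the value 2 was counted 3 times, the value 3 was counted 7 times.
counts = Dict(2 => 3, 3 => 7)

total    = sum(values(counts))               # 3 + 7 = 10
weighted = sum(k * v for (k, v) in counts)   # 2*3 + 3*7 = 27

weighted / total                             # 2.7
```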
hash(C::Counter) returns a hash value for C. Note that
clean! is applied to C before computing the hash. This is
done to ensure that equal counters give the same hash value.
May also be invoked as hash(C::Counter, h::UInt).
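A small sketch of the consequence: two counters that differ only by a stored zero should hash alike. The inputs are made up, and the expected result is inferred from the description above rather than taken from recorded output.

```julia
using Counters

C1 = counter(["alpha", "bravo"])
C2 = counter(["alpha", "bravo"])
C2["gamma"] = 0         # a zero-valued key that C1 does not hold

hash(C1) == hash(C2)    # expected to be true, because zeros are cleaned before hashing
```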
A Counter is a subtype of AbstractDict (named Associative before
Julia 0.7) and therefore we can use methods such as keys
and/or values to get iterators to those items.
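For instance, the iterators can drive an ordinary loop. This is a minimal sketch with invented data; the order in which keys are visited is not assumed, so the keys are sorted before printing.

```julia
using Counters

C = counter(["alpha", "bravo", "alpha"])

# Sort the keys so the output order does not depend on internal storage.
for k in sort(collect(keys(C)))
    println(k, " appears ", C[k], " time(s)")
end

sum(values(C))   # 3: the same total that sum(C) reports
```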
The function csv_print writes a Counter to the screen in
comma-separated format. This can be readily used for importing
into a spreadsheet.
julia> C = Counter{Float64}()
Counter{Float64} with 0 entries
julia> C[3.4]=10
10
julia> C[2.2]=3
3
julia> csv_print(C)
2.2, 3
3.4, 10
See the parallel-example directory for an illustration of how to
use Counters in multiple parallel processes.