HDFSiterator
Usage
use HDFSiterator;
Iterators for distributed iteration over data stored in the Hadoop Distributed File System (HDFS).
See HDFS.
iter HDFSiter(path: string, type rec, regex: string)
Iterate through an HDFS file (available in the default configured HDFS server) and yield records matching a regular expression.
Serial and leader-follower versions of this iterator are available.
Arguments:
- path -- the path to the file within the HDFS server
- rec -- the type of the records to return
- regex -- a regular expression with the same number of captures as
  the number of fields in the record type rec
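
As a minimal sketch of how a call might look, the example below reads a file and yields one record per match. The record type `Entry`, the path `/tmp/entries.txt`, and the pattern are hypothetical and chosen only for illustration; it assumes a default HDFS server is configured and that each capture is assigned to the corresponding field in declaration order.

```chapel
use HDFSiterator;

// Hypothetical record type: two fields, so the regular
// expression below must contain exactly two captures.
record Entry {
  var key: string;
  var value: string;
}

// Hypothetical path and pattern: lines of the form "key,value".
// Each match is assumed to fill the fields of one Entry in order.
for e in HDFSiter("/tmp/entries.txt", Entry, "(\\w+),(\\w+)") do
  writeln(e);
```

Because leader-follower versions of the iterator are available, the same call should also be usable in a `forall` loop when the order in which records are processed does not matter.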