Home
Documentation for wukong lives at http://mrflip.github.com/wukong and in the repo itself, within the gh-pages branch.
If you’d like to contribute documentation here, please ping me and I’ll integrate it into the gh-pages branch; or (easier yet) just fork the repo and edit the gh-pages branch yourself.
Go To the Hadoop Documentation
Index (possibly out of date)
- Tutorial
- Count Words
- Structured data
- Accumulators including a UniqByLastReducer and a GroupBy reducer.
- Wutils: command-line utilities for working with data
- Overview of wutils: command listing
- Stupid command-line tricks using the wutils
- wu-lign: present a tab-separated file as aligned columns
- Dear Lazyweb, please build us a tab-oriented version of the Textutils library
- Links and tips for configuring and working with hadoop
- Some opinionated thoughts on working with big data, on why you should drop acid, treat exceptions as records, and happily embrace variable-length strings as primary keys.
- Wukong is licensed under the Apache License (same as Hadoop)
- Work in progress: an intro to data processing with wukong







