Projects: twitter/scalding
Improve text formats
Updated Oct 7, 2017
We currently have a legacy, reflection-based TSV/CSV implementation. We also have a typeclass-based FieldsDescriptor that can make such text formats correct.
We need to migrate everything to the new format and make FieldsDescriptor composable, so that users can more easily add their own implementations and interoperate with the macros.
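As a rough illustration of what "composable" could mean here, the sketch below builds a codec for nested tuples out of codecs for their parts, and lets a user derive one for their own type. `RowCodec`, `imap`, `toTsv`, `fromTsv` and `Celsius` are hypothetical names invented for this example; this is not the actual FieldsDescriptor API, and escaping of tabs inside values is ignored.

```scala
object TextFormatSketch {
  // Hypothetical typeclass for illustration; NOT Scalding's FieldsDescriptor.
  trait RowCodec[T] {
    def width: Int                       // number of text columns this type occupies
    def write(t: T): Vector[String]
    def read(cols: Vector[String]): T
  }

  object RowCodec {
    def apply[T](implicit c: RowCodec[T]): RowCodec[T] = c

    implicit val intCodec: RowCodec[Int] = new RowCodec[Int] {
      val width: Int = 1
      def write(t: Int): Vector[String] = Vector(t.toString)
      def read(cols: Vector[String]): Int = cols(0).toInt
    }

    implicit val stringCodec: RowCodec[String] = new RowCodec[String] {
      val width: Int = 1
      def write(t: String): Vector[String] = Vector(t)
      def read(cols: Vector[String]): String = cols(0)
    }

    // Composition: a pair's codec is derived from the codecs of its two parts.
    implicit def pairCodec[A, B](implicit a: RowCodec[A], b: RowCodec[B]): RowCodec[(A, B)] =
      new RowCodec[(A, B)] {
        val width: Int = a.width + b.width
        def write(t: (A, B)): Vector[String] = a.write(t._1) ++ b.write(t._2)
        def read(cols: Vector[String]): (A, B) =
          (a.read(cols.take(a.width)), b.read(cols.slice(a.width, width)))
      }

    // Lets users derive a codec for their own type from an existing one.
    def imap[A, B](c: RowCodec[A])(from: A => B)(to: B => A): RowCodec[B] =
      new RowCodec[B] {
        val width: Int = c.width
        def write(b: B): Vector[String] = c.write(to(b))
        def read(cols: Vector[String]): B = from(c.read(cols))
      }
  }

  // A user-defined type with its own instance (hypothetical example type).
  final case class Celsius(degrees: Int)
  object Celsius {
    implicit val codec: RowCodec[Celsius] =
      RowCodec.imap(RowCodec.intCodec)(Celsius(_))(_.degrees)
  }

  def toTsv[T: RowCodec](t: T): String = RowCodec[T].write(t).mkString("\t")

  def fromTsv[T: RowCodec](line: String): T =
    RowCodec[T].read(line.split("\t", -1).toVector)
}
```

With these definitions, `toTsv((Celsius(21), "oslo"))` resolves the pair instance from the user's `Celsius` instance and the built-in `String` instance without any reflection.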
Improve the Serialization macros
Updated Oct 7, 2017
The current OrderedSerialization and Serialization macros have several problems:
- They don't use implicit recursion, so if a nested case class holds a custom type, we can't generate the OrderedSerialization. This is confusing for users and blocks some use cases.
- They don't compose as well as we would like: without a length header and extra hoops, comparing a Tuple2 means reading both sides of the tuple, even if the second part has a static size (or an easily readable size).
- The imports are hard to discover: to get the macro implementations you have to import a deeply nested function.
We don't want to break source compatibility if we can help it for the most common use cases, but we want to address these issues.
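The first two points can be sketched with a plain typeclass, assuming a design in which each instance knows how to skip its own serialized form. `BinaryOrd`, `staticSize`, `skip`, `by`, `toBuffer` and `UserId` are hypothetical names made up for this example; this is not the OrderedSerialization API, and the real macros may work quite differently.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object OrderedSerSketch {
  // Hypothetical typeclass for illustration; NOT Scalding's OrderedSerialization.
  trait BinaryOrd[T] {
    def staticSize: Option[Int]                 // Some(n) if every value takes n bytes
    def write(buf: ByteBuffer, t: T): Unit
    def skip(buf: ByteBuffer): Unit             // advance past one value without decoding it
    // Compares one value from each buffer and leaves both buffers positioned after it.
    def compareBinary(a: ByteBuffer, b: ByteBuffer): Int
  }

  object BinaryOrd {
    implicit val int: BinaryOrd[Int] = new BinaryOrd[Int] {
      val staticSize: Option[Int] = Some(4)
      def write(buf: ByteBuffer, t: Int): Unit = { buf.putInt(t); () }
      def skip(buf: ByteBuffer): Unit = { buf.position(buf.position() + 4); () }
      def compareBinary(a: ByteBuffer, b: ByteBuffer): Int =
        Integer.compare(a.getInt(), b.getInt())
    }

    // Variable size, but the size is cheap to read: a 4-byte length prefix.
    implicit val string: BinaryOrd[String] = new BinaryOrd[String] {
      val staticSize: Option[Int] = None
      def write(buf: ByteBuffer, t: String): Unit = {
        val bytes = t.getBytes(StandardCharsets.UTF_8)
        buf.putInt(bytes.length)
        buf.put(bytes)
        ()
      }
      def skip(buf: ByteBuffer): Unit = {
        val n = buf.getInt()
        buf.position(buf.position() + n)
        ()
      }
      private def read(buf: ByteBuffer): String = {
        val bytes = new Array[Byte](buf.getInt())
        buf.get(bytes)
        new String(bytes, StandardCharsets.UTF_8)
      }
      def compareBinary(a: ByteBuffer, b: ByteBuffer): Int = read(a).compareTo(read(b))
    }

    // Implicit recursion: the pair instance is assembled from whatever instances
    // are in scope for A and B, including user-supplied ones.
    implicit def pair[A, B](implicit a: BinaryOrd[A], b: BinaryOrd[B]): BinaryOrd[(A, B)] =
      new BinaryOrd[(A, B)] {
        val staticSize: Option[Int] = for (x <- a.staticSize; y <- b.staticSize) yield x + y
        def write(buf: ByteBuffer, t: (A, B)): Unit = { a.write(buf, t._1); b.write(buf, t._2) }
        def skip(buf: ByteBuffer): Unit = { a.skip(buf); b.skip(buf) }
        def compareBinary(x: ByteBuffer, y: ByteBuffer): Int = {
          val c = a.compareBinary(x, y)
          // If the first fields differ, the second is skipped rather than decoded.
          if (c != 0) { b.skip(x); b.skip(y); c } else b.compareBinary(x, y)
        }
      }

    // Derive an instance for a custom type from an existing one.
    def by[T, A](f: T => A)(implicit a: BinaryOrd[A]): BinaryOrd[T] = new BinaryOrd[T] {
      val staticSize: Option[Int] = a.staticSize
      def write(buf: ByteBuffer, t: T): Unit = a.write(buf, f(t))
      def skip(buf: ByteBuffer): Unit = a.skip(buf)
      def compareBinary(x: ByteBuffer, y: ByteBuffer): Int = a.compareBinary(x, y)
    }
  }

  // A custom type whose instance lives in its companion (hypothetical example type).
  final case class UserId(value: Int)
  object UserId {
    implicit val ord: BinaryOrd[UserId] = BinaryOrd.by[UserId, Int](_.value)
  }

  def toBuffer[T](t: T)(implicit o: BinaryOrd[T]): ByteBuffer = {
    val buf = ByteBuffer.allocate(256)
    o.write(buf, t)
    buf.flip()
    buf
  }
}
```

Because instances are ordinary implicits, `implicitly[BinaryOrd[(UserId, String)]]` resolves through the `UserId` companion with a single `import OrderedSerSketch._`, which also hints at how the import problem in the third point might be eased.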
Modularize the typed API
Updated Feb 15, 2018
Make the typed API independent of Cascading, allowing backends such as Spark and Flink. Modularize the optimizer so that rules can be shared across backends.
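One way to picture this split, as a sketch only: describe the pipeline as plain data, write optimizer rules as rewrites of that data, and let each backend interpret the result. `Pipe`, `Source`, `Mapped`, `Filtered`, `Backend`, `ListBackend`, `fuseFilters` and `run` are hypothetical names for this example and are not Scalding's typed API; a Spark or Flink backend is assumed to implement the same `Backend` trait.

```scala
object TypedPlanSketch {
  // Hypothetical pipeline description with no reference to any execution engine.
  sealed trait Pipe[T]
  final case class Source[T](data: List[T]) extends Pipe[T]
  final case class Mapped[A, B](in: Pipe[A], f: A => B) extends Pipe[B]
  final case class Filtered[T](in: Pipe[T], p: T => Boolean) extends Pipe[T]

  // An optimizer rule is a rewrite of the description, so any backend can reuse it.
  // This one merges directly nested filters into a single filter.
  def fuseFilters[T](p: Pipe[T]): Pipe[T] = p match {
    case Filtered(Filtered(in, f), g) =>
      fuseFilters(Filtered(in, (t: T) => f(t) && g(t)))
    case Filtered(in, f)  => Filtered(fuseFilters(in), f)
    case m: Mapped[a, T]  => Mapped(fuseFilters(m.in), m.f)
    case other            => other
  }

  // What a backend has to provide (hypothetical interface).
  trait Backend[F[_]] {
    def source[T](data: List[T]): F[T]
    def map[A, B](in: F[A])(f: A => B): F[B]
    def filter[T](in: F[T])(p: T => Boolean): F[T]
  }

  // In-memory backend, used here only to make the sketch runnable.
  object ListBackend extends Backend[List] {
    def source[T](data: List[T]): List[T] = data
    def map[A, B](in: List[A])(f: A => B): List[B] = in.map(f)
    def filter[T](in: List[T])(p: T => Boolean): List[T] = in.filter(p)
  }

  // Interprets a description on whichever backend is passed in.
  def run[F[_], T](p: Pipe[T], b: Backend[F]): F[T] = p match {
    case Source(data)       => b.source(data)
    case m: Mapped[a, T]    => b.map(run(m.in, b))(m.f)
    case Filtered(in, pred) => b.filter(run(in, b))(pred)
  }
}
```

The rule never mentions a backend and the backend never mentions a rule, which is the property that would let rules be shared.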