Answer (score: 20)
Here's a Shapeless implementation that takes a slightly different approach from the one in your proposed example. This is based on some code I've written in the past, and the main difference from your implementation is that this one is a little more general—for example the actual CSV parsing part is factored out so that it's easy to use a dedicated library.
First, an all-purpose `Read` type class (no Shapeless yet):
```scala
import scala.util.{ Failure, Success, Try }

trait Read[A] { def reads(s: String): Try[A] }

object Read {
  def apply[A](implicit readA: Read[A]): Read[A] = readA

  implicit object stringRead extends Read[String] {
    def reads(s: String): Try[String] = Success(s)
  }

  implicit object intRead extends Read[Int] {
    def reads(s: String): Try[Int] = Try(s.toInt)
  }

  // And so on...
}
```
And then for the fun part: a type class that provides a (possibly failing) conversion from a list of strings to an `HList`:
```scala
import shapeless._

trait FromRow[L <: HList] { def apply(row: List[String]): Try[L] }

object FromRow {
  import HList.ListCompat._

  def apply[L <: HList](implicit fromRow: FromRow[L]): FromRow[L] = fromRow

  def fromFunc[L <: HList](f: List[String] => Try[L]) = new FromRow[L] {
    def apply(row: List[String]) = f(row)
  }

  implicit val hnilFromRow: FromRow[HNil] = fromFunc {
    case Nil => Success(HNil)
    case _   => Failure(new RuntimeException("No more rows expected"))
  }

  // Note the #: list extractor from ListCompat: once shapeless._ is imported,
  // :: refers to Shapeless's HList cons, not List's.
  implicit def hconsFromRow[H: Read, T <: HList: FromRow]: FromRow[H :: T] =
    fromFunc {
      case h #: t => for {
        hv <- Read[H].reads(h)
        tv <- FromRow[T].apply(t)
      } yield hv :: tv
      case Nil => Failure(new RuntimeException("Expected more cells"))
    }
}
```
And finally to make it work with case classes:
```scala
trait RowParser[A] {
  def apply[L <: HList](row: List[String])(implicit
    gen: Generic.Aux[A, L],
    fromRow: FromRow[L]
  ): Try[A] = fromRow(row).map(gen.from)
}

def rowParserFor[A] = new RowParser[A] {}
```
Now we can write the following, for example, using OpenCSV:
```scala
case class Foo(s: String, i: Int)

import au.com.bytecode.opencsv._
import scala.collection.JavaConverters._

val reader = new CSVReader(new java.io.FileReader("foos.csv"))
val foos = reader.readAll.asScala.map(row => rowParserFor[Foo](row.toList))
```
And if we have an input file like this:
```
first,10
second,11
third,twelve
```
We'll get the following:
```
scala> foos.foreach(println)
Success(Foo(first,10))
Success(Foo(second,11))
Failure(java.lang.NumberFormatException: For input string: "twelve")
```
(Note that this conjures up `Generic` and `FromRow` instances for every line, but it'd be pretty easy to change that if performance is a concern.)
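For example (a sketch of mine, not from the original answer), you can resolve the instances once per case class and reuse the resulting function for every line. The `Read`/`FromRow` machinery above is restated compactly so the snippet stands alone, and `cachedRowParserFor` is a hypothetical name:

```scala
import scala.util.{ Failure, Success, Try }
import shapeless._

// Compact restatement of the Read/FromRow machinery from the answer above.
trait Read[A] { def reads(s: String): Try[A] }
object Read {
  implicit val stringRead: Read[String] = new Read[String] { def reads(s: String) = Success(s) }
  implicit val intRead: Read[Int] = new Read[Int] { def reads(s: String) = Try(s.toInt) }
}

trait FromRow[L <: HList] { def apply(row: List[String]): Try[L] }
object FromRow {
  implicit val hnilFromRow: FromRow[HNil] = new FromRow[HNil] {
    def apply(row: List[String]) =
      if (row.isEmpty) Success(HNil) else Failure(new RuntimeException("No more cells expected"))
  }
  implicit def hconsFromRow[H, T <: HList](implicit readH: Read[H], fromT: FromRow[T]): FromRow[H :: T] =
    new FromRow[H :: T] {
      def apply(row: List[String]) =
        if (row.isEmpty) Failure(new RuntimeException("Expected more cells"))
        else for { hv <- readH.reads(row.head); tv <- fromT(row.tail) } yield hv :: tv
    }
}

// The caching variant: Generic and FromRow are resolved once, when build is
// called, and the returned function reuses them for every line.
class CachedRowParser[A] {
  def build[L <: HList](implicit gen: Generic.Aux[A, L], fromRow: FromRow[L]): List[String] => Try[A] =
    row => fromRow(row).map(gen.from)
}
def cachedRowParserFor[A]: CachedRowParser[A] = new CachedRowParser[A]

case class Foo(s: String, i: Int)
val parseFoo = cachedRowParserFor[Foo].build
```

`parseFoo` can then be mapped over all the rows without re-deriving anything: a well-formed row yields a `Success(Foo(...))` and a non-numeric age comes back as a `Failure`.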
Answer (score: 2)
Here's a solution using product-collections:
```scala
import com.github.marklister.collections.io._
import scala.util.Try

case class Person(name: String, age: Int)

val csv = """Foo,19
            |Ro
            |Bar,24""".stripMargin

class TryIterator[T](it: Iterator[T]) extends Iterator[Try[T]] {
  def next = Try(it.next)
  def hasNext = it.hasNext
}

new TryIterator(CsvParser(Person).iterator(new java.io.StringReader(csv))).toList
```
```
res14: List[scala.util.Try[Person]] =
  List(Success(Person(Foo,19)), Failure(java.lang.IllegalArgumentException: 1 at line 2 => Ro), Success(Person(Bar,24)))
```
Apart from the error handling, this gets pretty close to the `val iter = csvParserFor[Person].parseLines(lines)` you were looking for:

```scala
val iter = CsvParser(Person).iterator(reader)
```
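The `TryIterator` wrapper above is library-agnostic, so (as a standalone check of mine, without product-collections) it can wrap any iterator whose `next` may throw, such as a parser's row iterator:

```scala
import scala.util.{ Success, Try }

// Same wrapper as above: each call to next() is captured as a Try,
// so one malformed element doesn't abort the whole traversal.
class TryIterator[T](it: Iterator[T]) extends Iterator[Try[T]] {
  def next() = Try(it.next())
  def hasNext = it.hasNext
}

// A lazy stand-in for a CSV row iterator: the middle element fails to parse.
val rows = List("1", "oops", "3").iterator.map(_.toInt)
val results = new TryIterator(rows).toList
// List(Success(1), Failure(java.lang.NumberFormatException: ...), Success(3))
```

Because `Iterator.map` is lazy, the `toInt` only runs inside `next()`, where the `Try` catches it and iteration continues with the next element.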
Answer (score: 6)
Starting in Scala 2.13, it's possible to pattern match a `String` by unapplying a string interpolator:
```scala
// case class Person(name: String, age: Int)
val csv = "Foo,19\nRo\nBar,24".split("\n")

csv.map {
  case s"$name,$age" => Right(Person(name, age.toInt))
  case line          => Left(s"Cannot read '$line'")
}
// Array(Right(Person("Foo", 19)), Left("Cannot read 'Ro'"), Right(Person("Bar", 24)))
```
Note that you can also use regexes within the extractor. That helps in our case to consider a row invalid if the age isn't an integer:
```scala
// val csv = "Foo,19\nRo\nBar,2R".split("\n")
val Age = "(\\d+)".r

csv.map {
  case s"$name,${Age(age)}" => Right(Person(name, age.toInt))
  case line @ s"$name,$age" => Left(s"Age is not an integer in '$line'")
  case line                 => Left(s"Cannot read '$line'")
}
// Array(Right(Person("Foo", 19)), Left("Cannot read 'Ro'"), Left("Age is not an integer in 'Bar,2R'"))
```
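If you then want to separate the well-formed rows from the errors (my addition, not part of the original answer), the `Either` results pair naturally with `partitionMap` (Scala 2.13+):

```scala
case class Person(name: String, age: Int)

val csv = "Foo,19\nRo\nBar,24".split("\n")

// Parse each line with a string-interpolator pattern, guarding on a numeric age.
val parsed = csv.toList.map {
  case s"$name,$age" if age.nonEmpty && age.forall(_.isDigit) => Right(Person(name, age.toInt))
  case line                                                   => Left(s"Cannot read '$line'")
}

// partitionMap splits the Lefts from the Rights in one pass.
val (errors, people) = parsed.partitionMap(identity)
// errors: List(Cannot read 'Ro')
// people: List(Person(Foo,19), Person(Bar,24))
```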
Answer (score: 14)
kantan.csv seems like what you want. If you want zero boilerplate, you can use its shapeless module and write:
```scala
import java.io.File
import kantan.csv.ops._
import kantan.csv.generic.codecs._

new File("path/to/csv").asCsvRows[Person](',', false).toList
```
Which, on your input, will yield:
```
res2: List[kantan.csv.DecodeResult[Person]] = List(Success(Person(Foo,19)), DecodeFailure, Success(Person(Bar,24)))
```
Note that the actual return type is an iterator, so you don't have to hold the whole CSV file in memory at any point, the way your example does with `Stream`.
If the shapeless dependency is too much, you can drop it and provide your own case class type classes with minimal boilerplate:
```scala
implicit val personCodec = RowCodec.caseCodec2(Person.apply, Person.unapply)(0, 1)
```
Full disclosure: I'm the author of kantan.csv.
Source: stackoverflow.com