score:0

This should do what you want:

  val splitter: String => Boolean = s => s.isEmpty
  val data = List("abc", "def", "", "ghi", "jkl", "", "mno", "pqr")

  def splitList[A](l: List[A], p: A => Boolean):List[List[A]] = {
    l match {
      case Nil => Nil
      case _ =>
        val (h, t) = l.span(a => !p(a))
        h :: splitList(t.drop(1), p)
    }
  }

  println(splitList(data, splitter))
//prints List(List(abc, def), List(ghi, jkl), List(mno, pqr))

score:0

UPDATE #2: Travis Brown answered another question using Scalaz-streams, an interesting package that might be helpful to you here. I am just starting to look at the package, but was quickly able to use it to read data from a file containing this:

abc
def

ghi
jkl
mno

pqr

and produce another file that looked like this:

Vector(abc, def, )
Vector(ghi, jkl, mno, )
Vector(pqr)

The library only holds the Vector being accumulated in memory. Here's my code (which should be considered dangerous, as I barely know anything about Scalaz-streams):

import scalaz.stream._
io.linesR("/tmp/a")
  .pipe( process1.chunkBy(_.nonEmpty) )
  .map( _.toString + "\n" )
  .pipe(text.utf8Encode)
  .to( io.fileChunkW("/tmp/b") )
  .run.run

Key to your task is the chunkBy(_.nonEmpty), which accumulates lines into a Vector until it hits an empty line. I have no idea at this point why you have to say run twice.

Old stuff below.

UPDATE #1: Ah! I just discovered the new constraint that it not all be read into memory. This solution isn't for you, then; you'd want Iterators or Streams.

I'm guessing that you'd want to enrich Traversable. And with the function in a separate argument list, the compiler can infer the types. For performance you would probably only want to make one pass over the data. And to avoid crashing with large datasets (and for performance), you wouldn't want any recursion that is not tail-recursion. Given this enricher:

implicit class EnrichedTraversable[A]( val xs:Traversable[A] ) extends AnyVal {
  def splitWhere( f: A => Boolean ) = {
    @tailrec
    def loop( xs:Traversable[A], group:Seq[A], groups:Seq[Seq[A]] ):Seq[Seq[A]] =
      if ( xs.isEmpty ) {
        groups :+ group
      } else {
        val x    = xs.head
        val rest = xs.tail
        if ( f(x) ) loop( rest, Vector(), groups :+ group )
        else        loop( rest, group :+ x, groups )
      }
    loop( xs, Vector(), Vector() )
  }
}

you can do this:

List("a","b","","c","d") splitWhere (_.isEmpty)

Here are some tests you might want to check out, to be sure the semantics are what you want (I personally like splits to behave this way):

val xs = List("a","b","","d","e","","f","g")    //> xs  : List[String] = List(a, b, "", d, e, "", f, g)
xs               splitWhere (_.isEmpty)         //> res0: Seq[Seq[String]] = Vector(Vector(a, b), Vector(d, e), Vector(f, g))
List("a","b","") splitWhere (_.isEmpty)         //> res1: Seq[Seq[String]] = Vector(Vector(a, b), Vector())
List("")         splitWhere (_.isEmpty)         //> res2: Seq[Seq[String]] = Vector(Vector(), Vector())
List[String]()   splitWhere (_.isEmpty)         //> res3: Seq[Seq[String]] = Vector(Vector())
Vector("a","b","","c") splitWhere (_.isEmpty)   //> res4: Seq[Seq[String]] = Vector(Vector(a, b), Vector(c))

score:0

I think Stream is what you want since they are evaluated lazily (not everything in memory).

def split[A](inputStream: Stream[A], splitter: A => Boolean): Stream[List[A]] = {
    var accumulationList: List[A] = Nil
    def loop(inputStream: Stream[A]): Stream[List[A]] = {
      if (inputStream.isEmpty) {
        if (accumulationList.isEmpty)
          Stream.empty[List[A]]
        else
          accumulationList.reverse #:: Stream.empty[List[A]]
      } else if (splitter(inputStream.head)) {
        val outputList = accumulationList.reverse
        accumulationList = Nil
        if (outputList.isEmpty)
          loop(inputStream.tail)
        else
          outputList #:: loop(inputStream.tail)
      } else {
        accumulationList = inputStream.head :: accumulationList
        loop(inputStream.tail)
      }
    }
    loop(inputStream)
  }

  val splitter = { s: String => s.isEmpty }
  val list = List("asdf", "aa", "", "fw", "", "wfwf", "", "")
  val stream = split(list.toStream, splitter)
  stream foreach println

The output is:

List(asdf, aa)
List(fw)
List(wfwf)

EDIT:
I have not looked at it in detail, but I guess my recursive method loop could be replaced by a foldLeft or foldRight.

score:0

Here it is:

scala> val data = List("abc", "def", "", "ghi", "jkl", "mno", "", "pqr").iterator
data: Iterator[String] = non-empty iterator

scala> val splitter: String => Boolean = s => s.isEmpty
splitter: String => Boolean = <function1>


scala> def f[A](in: Iterator[A], sf: A => Boolean): Iterator[Iterator[A]] = 
         in.hasNext  match {
     |   case false => Iterator()
     |   case true  => Iterator(in.takeWhile(x => !sf(x))) ++ f(in, sf)
     | }
f: [A](in: Iterator[A], sf: A => Boolean)Iterator[Iterator[A]]

scala> f(data, splitter) foreach (x => println(x.toList))
List(abc, def)
List(ghi, jkl, mno)
List(pqr)

More Query from same tag