score:5
Accepted answer
Assuming val txt = txt1 + txt2 + txt3, you can wrap the text in an XML root element, parse the resulting string as XML, and use the scala.xml library (shipped as the separate scala-xml module since Scala 2.11) to extract the anchors.
// can do other cleanup if necessary here such as changing "link!"
def normalize(t: String) = t.toLowerCase()
val txtAsXML = xml.XML.loadString("<root>" + txt + "</root>")
val anchors = txtAsXML \\ "a"
// returns scala.xml.NodeSeq containing the <a> tags
Then you just need to post-process the anchors until the data is organized the way you want:
val tuples = anchors.map(a => normalize(a.text) -> a.attributes("href").toString)
// Seq[(String, String)] containing elements
// like "mp3" -> "http://www.google.com/test.mp3"
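// note: on Scala 2.13+, mapValues is deprecated and returns a lazy MapView;
// use .view.mapValues(...).toMap there if you need a strict Map
val byTypes = tuples.groupBy(_._1).mapValues(seq => seq.map(_._2))
// here grouped by types: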
// Map(img -> List(http://www.google.com/test.jpg),
// link! -> List(http://www.google.com/),
// mp3 -> List(http://www.google.com/test.mp3))
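The steps above can be pulled together into one small program. This is only a sketch under a few assumptions: scala-xml is on the classpath, the sample input is made up (it is not the original txt1/txt2/txt3), and the names AnchorsByType and anchorsByType are hypothetical. It differs from the snippets above in two ways: parsing is wrapped in Try, because loadString throws when the text is not well-formed XML, and the href is read with \ "@href", which yields an empty string when the attribute is missing.

```scala
import scala.util.Try
import scala.xml.XML

object AnchorsByType {
  // Trim and lowercase the link label so "MP3" and "mp3" land in the same group.
  def normalize(t: String): String = t.trim.toLowerCase

  // Returns label -> hrefs, or a Failure if the wrapped text is not well-formed XML.
  def anchorsByType(txt: String): Try[Map[String, Seq[String]]] =
    Try(XML.loadString("<root>" + txt + "</root>")).map { root =>
      (root \\ "a")
        .map(a => normalize(a.text) -> (a \ "@href").text)
        .filter { case (_, href) => href.nonEmpty } // skip anchors without an href
        .groupBy(_._1)
        .map { case (label, pairs) => label -> pairs.map(_._2) }
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input, assumed for illustration only.
    val txt =
      """see <a href="http://www.google.com/test.mp3">MP3</a> and
        |<a href="http://www.google.com/test.jpg">img</a> or <a>no link</a>""".stripMargin
    println(anchorsByType(txt))

    // An unquoted attribute value is not well-formed XML, so this is a Failure.
    println(anchorsByType("<a href=x>broken</a>").isFailure)
  }
}
```

Returning a Try leaves it to the caller to decide what to do with text that is HTML but not valid XML (unescaped & in URLs, unclosed tags such as <br>), which this approach cannot parse.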
Source: stackoverflow.com