score:3
Accepted answer
Promoting a comment to an answer:
It can't be done at the plain-text level, because the semantic information about pages (slides) has already been thrown away by that point.
What you need to do is extract the PowerPoint file as XHTML; the Tika website has examples of how to do that from Java. Once you've got the XHTML, you'll see that it has a structure like:
<body>
  <div class="slideshow">
    <div class="slide">
      <div class="slide-master-content">
      </div>
      <div class="slide-content">
      </div>
    </div>
    <div class="slide">
      <div class="slide-content">
      </div>
    </div>
  </div>
  <div class="slide-notes">
  </div>
</body>
So you'll find a div for each slide, and within it you can see which content belongs to the slide itself and which came from the slide master (if any). Split the document on the slide divs, grab the text out of each one, and you're there!
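As a rough sketch of the two steps above in Scala (not a definitive implementation): `toXhtml` asks Tika for XHTML via `ToXMLContentHandler`, and `slideTexts` splits that XHTML on `<div class="slide">` using the JDK's DOM parser. The object and method names (`SlideText`, `toXhtml`, `slideTexts`) are made up for illustration. It assumes Tika's parser modules are on the classpath and that the output uses the class names shown in the structure above, which may vary between Tika versions.

```scala
import java.io.{InputStream, StringReader}
import java.nio.file.{Files, Paths}
import javax.xml.parsers.DocumentBuilderFactory

import org.apache.tika.metadata.Metadata
import org.apache.tika.parser.{AutoDetectParser, ParseContext}
import org.apache.tika.sax.ToXMLContentHandler
import org.w3c.dom.Element
import org.xml.sax.InputSource

// Hypothetical helper object; the names are illustrative, not part of Tika.
object SlideText {

  /** Parse a document with Tika and return its XHTML rendering as a string. */
  def toXhtml(in: InputStream): String = {
    val handler = new ToXMLContentHandler()
    new AutoDetectParser().parse(in, handler, new Metadata(), new ParseContext())
    handler.toString
  }

  /** Return the text of every <div class="slide">, in document order.
    * Assumption: slides are marked with exactly class="slide", as in the
    * structure shown above. The text includes any slide-master content.
    */
  def slideTexts(xhtml: String): Seq[String] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new InputSource(new StringReader(xhtml)))
    val divs = doc.getElementsByTagName("div")
    (0 until divs.getLength)
      .map(i => divs.item(i).asInstanceOf[Element])
      .filter(_.getAttribute("class") == "slide")
      .map(_.getTextContent.trim)
  }

  def main(args: Array[String]): Unit = {
    // args(0) is the path to the presentation file.
    val in = Files.newInputStream(Paths.get(args(0)))
    try {
      slideTexts(toXhtml(in)).zipWithIndex.foreach { case (text, i) =>
        println(s"--- slide ${i + 1} ---")
        println(text)
      }
    } finally in.close()
  }
}
```

To leave out text inherited from the slide master, one option is to read only the child div with class `slide-content` instead of calling `getTextContent` on the whole slide div.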
Source: stackoverflow.com