相关文章推荐
绅士的水煮肉  ·  neil young and the ...·  昨天    · 
Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

How can one use SparkListener callback functions? Can we only use it for logging info? Is there any way to accumulate sparkListener.onTaskEnd information for analysis in Spark application code?

For example..

sparkSession.sparkContext.addListener(new SomeListener)
class SomeListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd){
  // is there a proper/correct/valid way to accumulate this and use
  // it later in application  code?
  taskEnd.taskInfo

Can we get info gathered in callback functions?

This is in context of Spark Structure Streaming. It provides event lifecycle related information which would be helpful for logging, troubleshooting etc. Below is a small implementation.

sparkSession.streams.addListener(listenStreamingQuery)
  val listenStreamingQuery: StreamingQueryListener = new StreamingQueryListener() {
    def onQueryStarted(event: QueryStartedEvent): Unit = {
      println("Query started: " + event.id + " at: " + System.currentTimeMillis)
    def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
      println("Query terminated: " + event.id + " at: " + System.currentTimeMillis)
    def onQueryProgress(event: QueryProgressEvent): Unit = {
      val queryProgress = event.progress
      println("Query progress: " + queryProgress)
      if (queryProgress.numInputRows > 0) {
        // Implement Logging. Query the in memory table 
       val currentDf = spark.sql("select * from schema_counts order by count desc")
        // group them together into a single spark partition and overwrite them into a hive table.
        currentDf.repartition(1).write.mode("overwrite").saveAsTable("qa.eventlogging_valid_mixed_schema_counts")
        currentDf.show()
        

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.