Task Scheduling on PCF
Kubernetes support enhancements
App hosting tool
Composed Task Runner security
DSL and deployment property parsing refinements
Batch Database Schema and Optimization
A typical workflow for batch data processing involves scheduling batch applications. For example, the scheduler system accepts a cron expression and launches the application whenever the expression matches the current time.
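To make "the expression matches the current time" concrete, here is a minimal sketch of a five-field cron matcher. It is an illustration only, not the scheduler's implementation: the function names `field_matches` and `cron_matches` are hypothetical, and the sketch assumes the classic `minute hour day-of-month month day-of-week` layout, which may differ from the cron dialect a given scheduler accepts.

```python
from datetime import datetime


def field_matches(expr: str, value: int, lo: int, hi: int) -> bool:
    """Return True if one cron field (e.g. '*/15', '1-5', '0,30') accepts value."""
    for part in expr.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            first, last = rng.split("-", 1)
            start, end = int(first), int(last)
        else:
            start = int(rng)
            # "5/10" runs from 5 to the field maximum; a bare "5" means only 5.
            end = hi if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression: str, when: datetime) -> bool:
    """Check a five-field cron expression against a point in time.

    Simplification (assumption): all five fields must match. Classic cron
    treats day-of-month and day-of-week as alternatives when both are set.
    """
    minute, hour, day, month, weekday = expression.split()
    return (
        field_matches(minute, when.minute, 0, 59)
        and field_matches(hour, when.hour, 0, 23)
        and field_matches(day, when.day, 1, 31)
        and field_matches(month, when.month, 1, 12)
        # isoweekday() is 1..7 with Sunday = 7; cron uses 0..6 with Sunday = 0.
        and field_matches(weekday, when.isoweekday() % 7, 0, 6)
    )
```

A scheduler built along these lines would evaluate `cron_matches` once a minute and launch the application whenever it returns true.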
Data Flow provides the ability to schedule and unschedule a task definition. The schedule is based on a cron expression. Building upon the PCF Java Client, the team has created a portable scheduler interface in the Spring Cloud Scheduler SPI (Service Provider Interface) project and an implementation for PCF, Spring Cloud Scheduler for Cloud Foundry. The Dashboard provides access to schedule and unschedule a task as shown in the screenshot below.
The stream deployment history is available for review from the Dashboard. It is convenient to review the context-specific history of a stream from a central location, especially when CI/CD systems continually deploy new versions of the application artifacts that belong to the stream.
The Task/Jobs and About tabs have been redesigned to be consistent with the rest of the UI sections. The bulk operations, pagination, layout, and the general look and feel of the views have been modernized. Previously, the task execution status was stored but not displayed in the shell or the UI. Now it is displayed :)
The routing and navigation between the Task tabs, sub-tabs, and page views have gone through an update. You will notice improvements in state management when navigating from the list to the details page, and vice versa.
The SCDF Dashboard and Spring Flo stack have been upgraded to Angular 6. Several downstream dependencies including JointJS were updated as well. Though the test harness runs through a variety of browsers for incremental validation, if you see any abnormalities in different browsers, feel free to open an issue or bring it up in Gitter or StackOverflow. We appreciate any feedback.
A few improvements have been added, including support for deploying Boot Apps with secured actuators, so that the liveness and readiness probes can be resolved at runtime.
It is now possible to pass a custom Service Account Name for each stream/task deployment. This is particularly useful for scenarios where different stream/task deployments require different security permissions.
While Maven is the recommended approach for Stream/Task App artifact resolution, some users cannot use Maven for a variety of reasons. We have also heard about customers who install SCDF in a no-internet zone and cannot reach out to resolve Stream/Task artifacts via Maven, HTTP, or a Docker registry.
To address these concerns, we have developed an App Hosting Tool, which mimics a standalone App repository, but in reality, is a Spring Boot App serving the App artifacts through HTTP. You can read more about the App Tool and the getting-started instructions from here.
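The idea of a small application that serves App artifacts over HTTP can be sketched in a few lines. The actual tool is a Spring Boot App; the Python below is only a stand-in for the concept, and the `serve_artifacts` function, the directory layout, and the loopback address are assumptions made for the illustration.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_artifacts(directory: str, port: int = 0) -> ThreadingHTTPServer:
    """Serve every file under `directory` over HTTP on the loopback interface.

    Passing port=0 lets the operating system pick a free port; the chosen
    port is available afterwards as server.server_address[1].
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Run the accept loop in a daemon thread so the caller is not blocked.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With a server like this running inside the restricted network, an artifact placed in the served directory becomes reachable at a plain HTTP URL on that host and port, with no outbound internet access required.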
With continuing interest from the community, we have added support for secured access between Composed Task Runner and the Data Flow server. We have added basic authentication support and will add the other security options supported by Data Flow in upcoming releases.
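For readers unfamiliar with basic authentication, the sketch below shows how a client builds the standard HTTP `Authorization` header from a username and password. It illustrates the general scheme only; the helper name `basic_auth_header` is hypothetical, and this is not how Composed Task Runner is configured or implemented.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the HTTP Basic `Authorization` header for the given credentials."""
    # Basic auth encodes "username:password" with Base64; it does not encrypt it.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Because the credentials are only encoded, not encrypted, basic authentication is normally paired with HTTPS.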
Launching Tasks with custom arguments is a great way to give the Task application different behaviors at runtime. Imagine a batch job (running as a Task) that accepts a timezone as an argument to perform timezone-specific data processing. In this release, we have adapted the parsing logic to accept key-value pairs as values. Thanks to the community for reporting, giving us feedback, and sharing their use cases.
While reviewing the parsing rules for in-line vs. property-file-based properties for stream and task definitions, the community found a difference in behavior, which we have documented for general guidance.
Thanks to the community for thorough validation and feedback on the database schema. The batch and task schemas have been enhanced for MySQL and PostgreSQL to handle cases where there are large numbers of Task executions. Optimizations for other databases are on their way.
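The notion of "key-value pairs as values" can be sketched as follows: split each argument on the first `=` only, so anything after it, including further `=` signs, stays in the value. This is a hedged illustration and not Data Flow's parser; the function name `parse_task_arguments` and the argument names in the usage note are made up for the example.

```python
def parse_task_arguments(args):
    """Parse '--key=value' arguments, keeping any later '=' inside the value."""
    parsed = {}
    for arg in args:
        # Drop the conventional leading "--" if it is present.
        body = arg[2:] if arg.startswith("--") else arg
        # partition() splits on the first "=" only, so a value such as
        # "timezone=America/Chicago" survives intact.
        key, separator, value = body.partition("=")
        if not separator:
            raise ValueError(f"expected key=value, got {arg!r}")
        parsed[key] = value
    return parsed
```

For instance, a hypothetical argument `--job.settings=timezone=America/Chicago` yields the key `job.settings` with the value `timezone=America/Chicago`.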