The document management features in the UI need improvement. We will wait for the latest 2.0 release, which is expected to be much more mature than versions 1.8 through 1.10.

It is our understanding that Airflow is limited by design. For example, we sometimes need to create cycles in our workflow but cannot, because Airflow supports only Directed Acyclic Graphs (DAGs). We need to develop our own workflow descriptions and notations because, out of the box, Apache Airflow does not provide some of the features we need. Most of them are dispatched by the existing engine, but not all. Currently, in production, we have a large installation with a complex workflow that includes hundreds of tasks. The code does not cover all tasks in the data warehouse automation process. There are some drawbacks to this solution.

Legacy components can be from any platform, so more client support, such as client libraries for Java and Golang, would be helpful. I would also like to see support for more platforms, in terms of programming BPMs.

When we consider other products like jBPM, Camunda, or Cadence, they have the concept of pipelining. I would like part of the output of a previous step to be available as the input for the next step. If Airflow could come up with some kind of implementation where not every step of the pipeline is an independent step, that would be helpful. Right now, we're using an external state repository to maintain the state. At the workflow level, we want common state management so that, across steps, we can reach the state information. For example, Step 1 of my workflow will have output that I definitely want to be provided automatically as input to Step 2. Not every workflow can be implemented within Airflow. One specific feature that is missing from Airflow is that the steps of your workflow are not pipelined, meaning that every step of a workflow is stateless.
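To make the pipelining request concrete, here is a minimal sketch in plain Python of the behavior described above: each step receives the previous step's output, and all steps can read one shared workflow-level state. This is not Airflow code and does not use any Airflow API. The names `WorkflowState`, `Step`, and `run_pipeline`, and the three example steps, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class WorkflowState:
    """Hypothetical workflow-level state, shared by every step of one run."""
    values: Dict[str, Any] = field(default_factory=dict)


# A step takes the previous step's output and the shared state,
# and returns its own output.
StepFn = Callable[[Any, WorkflowState], Any]


@dataclass
class Step:
    """Hypothetical pipeline step: a name plus the function to run."""
    name: str
    fn: StepFn


def run_pipeline(steps: List[Step], initial_input: Any = None) -> WorkflowState:
    """Run steps in order, feeding each step's output into the next one."""
    state = WorkflowState()
    current = initial_input
    for step in steps:
        current = step.fn(current, state)
        # Record the output under the step's name so later steps can read it.
        state.values[step.name] = current
    return state


# Illustrative steps (assumed for this example, not taken from a real workflow).
def extract(_prev: Any, _state: WorkflowState) -> List[int]:
    return [3, 1, 2]


def transform(prev: List[int], _state: WorkflowState) -> List[int]:
    # Uses the output of the previous step directly as its input.
    return sorted(prev)


def load(prev: List[int], state: WorkflowState) -> Dict[str, int]:
    # Reads both the previous output and an earlier step's output from state.
    return {
        "rows_loaded": len(prev),
        "source_rows": len(state.values["extract"]),
    }


state = run_pipeline(
    [Step("extract", extract), Step("transform", transform), Step("load", load)]
)
print(state.values["load"])
```

In this sketch the runner, not an external repository, owns the state, which is the part we currently have to build ourselves.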
On top of that, the API sets that are provided are very limited, and in practice there is no realistic case in which we can connect to them. We need to generate our own scripts, upload them, and put them in place. We have to build several pipelines for several flows, yet there is no virtual control for pipelines with which to generate them.

Being on the cloud, it should be easy to scale; however, it's not. The scalability of the solution itself is not what we expected, and we had to become a little creative to solve that ourselves. There is no easy-to-manage, out-of-the-box option for integration, so we created our own way to integrate with specific tools. The management integration was challenging as well. Even though we customized it, there were some specific things we had to do with the image itself. There are several areas where we feel it could be a little more flexible.

The graphics in the past have not been ideal. If there were better features for the UI, like drag-and-drop, we could expand its use to more of our team, because we could then also rely on non-technical staff rather than just the three solid technical staff we have here. We want to do the ETL as code, but having a nice visual UI to facilitate this process would be great. Version 2 is supposedly better, but I have not tried it; based on what we use now, I would suggest more improvements to the visual UI. The UI is also not that attractive, and I feel that the user experience isn't that nice. For the most part, we have had to learn through trial and error how to operate it properly. Sometimes the UI becomes really slow and there's no easy way to diagnose the problem. When something fails, it's not easy to troubleshoot what went wrong.

Actually using Airflow is okay, but maintaining it has been difficult. In the earlier version that we're using, we sometimes have problems with maintenance complexity.
We're currently using version 1.10, but I understand that there are a lot of improvements in version 2.