updates
Signed-off-by: (Bit-Mage) <[email protected]>
(Bit-Mage) committed Nov 1, 2024
1 parent de2dcd0 commit 308386a
Showing 11 changed files with 123 additions and 33 deletions.
37 changes: 30 additions & 7 deletions Content/20230717135201-data_engineering.org
Original file line number Diff line number Diff line change
@@ -20,6 +20,7 @@
- will be indexing into relevant nodes from the stream sub nodes instead.
* Core Nodes
** Data Engineering Lifecycle
*** Overview
#+begin_src plantuml :file ./images/data-eng-lifecycle.png :exports both
@startuml

@@ -47,16 +48,38 @@ frame Applications {
Serving =right=> Applications
@enduml
#+end_src

#+RESULTS:
[[file:./images/data-eng-lifecycle.png]]

** Undercurrents
*** [[id:6e9b50dc-c5c0-454d-ad99-e6b6968b221a][Security]]
*** Data Management
*** DataOps
*** Data Architecture
*** [[id:f822f8f6-89eb-4aa8-ac8f-fdcff3f06fb9][Orchestration]]
*** [[id:5c2039f5-0c44-4926-b2d7-a8bf471923ac][Software Engineering]]
**** Generation
- source systems : origins of data in the lifecycle
- possibilities:
- [[id:b8f679c7-3ac1-48d7-b1b5-8e4743a62767][IoT]] device
- application [[id:1073cfed-a09d-48b6-bd52-ba09708699bf][message queue]]
- transactional [[id:2f67eca9-5076-4895-828f-de3655444ee2][database]]
- the data engineer consumes from the source systems but doesn't own them
- practical examples:
- application database
- IoT swarms
**** [[id:18491388-2dcc-488f-8f33-00582cf0f77e][Storage]]
- data architectures leverage several storage solutions for all kinds of flows, stores and transitions
- they also need to have side-car processing capabilities to serve complex queries
- storage is omnipresent across the cycle from ingestion to serving results and the transformations sandwiched within
- streaming frameworks like [[id:fa58feb4-25a2-40f1-8533-cafcb0d3886b][apache kafka]] and [[id:5e438030-0096-4b97-8931-f99eb7b738c5][pulsar]] can simultaneously function as ingestion, storage and query systems for messages
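
A minimal sketch of the last point, assuming a toy in-memory append-only log. =ToyLog=, =append= and =read= are hypothetical names for illustration, not the Kafka or Pulsar API. It only shows how a single log structure can cover ingestion (append), storage (retained records) and querying (replay from an offset):

#+begin_src python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """Hypothetical in-memory append-only log; not the Kafka or Pulsar API."""

    records: list = field(default_factory=list)

    def append(self, message):
        """Ingestion: store a message and return the offset it was written at."""
        self.records.append(message)
        return len(self.records) - 1

    def read(self, offset, max_messages=10):
        """Query: replay retained messages starting from a given offset."""
        return self.records[offset:offset + max_messages]


log = ToyLog()
for reading in (
    {"sensor": "a", "temp": 21},
    {"sensor": "b", "temp": 19},
    {"sensor": "a", "temp": 22},
):
    log.append(reading)

# Two independent consumers replay the same stored messages from different offsets.
from_start = log.read(0)
latest_only = log.read(2)
#+end_src

Because messages are retained rather than removed on read, each consumer only has to track its own offset. Real brokers add partitioning, replication and retention policies on top of this idea.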
**** Ingestion
**** Transformation
**** Serving
**** Applications
*** Undercurrents
**** [[id:6e9b50dc-c5c0-454d-ad99-e6b6968b221a][Security]]
**** Data Management
**** DataOps
**** Data Architecture
**** [[id:f822f8f6-89eb-4aa8-ac8f-fdcff3f06fb9][Orchestration]]
**** [[id:5c2039f5-0c44-4926-b2d7-a8bf471923ac][Software Engineering]]
*** [[id:9204583f-13ab-4039-9bfc-453700f8b0d1][The Data Life Cycle]]
- The data engineering lifecycle is a subset of the data life cycle (explored separately)
** [[id:710e11f8-780a-4aa5-84fc-c0ab9bb848c0][Big Data]]
* Tooling
** [[id:7aa94354-25d9-441b-993f-31ccc970edd3][Hadoop]]
3 changes: 1 addition & 2 deletions Content/20231030092756-robotics.org
@@ -2,5 +2,4 @@
:ID: f1ec552e-a7c4-47ae-9dd2-a23733d1da92
:END:
#+title: Robotics
#+filetags: :tbp:

#+filetags: :electronics:
2 changes: 2 additions & 0 deletions Content/20231227162344-computer_networks.org
@@ -12,6 +12,8 @@ I'll be exploring networks in a whimsical manner across several domains and will
Also initializing a stream: a lot of the other domains that I explore will be rooted to this node in an unstructured manner.

* Stream
** 0x22F5
- exploring swarm networks
** 0x22EC
- speed reading
** 0x22EB
23 changes: 0 additions & 23 deletions Content/20240220114146-electronic_storage.org
@@ -4,27 +4,4 @@
#+title: Electronic Storage
#+filetags: :electronics:cs:

Exploring this from the ground up - via electrons (Physics) to Logic Gates (Computer Science).

Will be unstructured as I'll be visiting this node from different perspectives over time.

* Misc Technical

- Some degrees of freedom that enable variations in the physical realization of a storage device are:
1. Speed of access
2. The underlying Physics of storage
3. Persistence of the data

For instance, speaking about two distinct instances:
- [[id:24f37c35-4292-437b-b814-864251f1e44f][qubits]] (quantum information theory)
- the smoothened binary notion of day and night at the equator based on the position of the sun.


* Sentinels
** Cache
:PROPERTIES:
:ID: c8a3e246-0f29-4909-ab48-0d34802451d5
:END:
- high speed memory taking advantage of the temporal locality of reference principle -> recently accessed data is likely to be accessed again.

- caches are a good first step towards improving a [[id:2f67eca9-5076-4895-828f-de3655444ee2][DataBase's]] performance under multiple accesses.
1 change: 1 addition & 0 deletions Content/20240717095231-message_brokers.org
@@ -1,5 +1,6 @@
:PROPERTIES:
:ID: 1073cfed-a09d-48b6-bd52-ba09708699bf
:ROAM_ALIASES: "Message Queue"
:END:
#+title: Message Brokers
#+filetags: :programming:tool:data:
1 change: 1 addition & 0 deletions Content/20241010100357-biomimicry.org
@@ -1,5 +1,6 @@
:PROPERTIES:
:ID: 2ac1cb5c-fd21-41a7-a30a-d6a2080d973e
:ROAM_ALIASES: bioMimetics
:END:
#+title: bioMimicry
#+filetags: :biology:
8 changes: 7 additions & 1 deletion Content/20241031150229-data_science_hierarchy_of_needs.org
@@ -23,9 +23,15 @@ From Upstream (root initiatives) to Downstream (consequent initiatives)
** [[id:a9f08fcf-c62d-40c0-a7fb-53d7f827b5ea][anomaly detection]]
** preprocessing/preparation
* aggregate/label
** Analytics
** Metrics
** Segments
** Aggregates
** Features
** Training data
* learn/optimize
** [[id:85ff1796-5245-4b42-8f97-64b1fc9487e0][A/B testing]]
** Experimentation
** simpler ML algorithms
* learn/optimize
** [[id:db649cb6-047e-426e-8cdc-774586ef30a0][AI]]
** [[id:20230713T110040.814546][Deep Learning]]
5 changes: 5 additions & 0 deletions Content/20241101165524-the_data_life_cycle.org
@@ -0,0 +1,5 @@
:PROPERTIES:
:ID: 9204583f-13ab-4039-9bfc-453700f8b0d1
:END:
#+title: The Data Life Cycle
#+filetags: :data:
40 changes: 40 additions & 0 deletions Content/20241101165737-iot.org
@@ -0,0 +1,40 @@
:PROPERTIES:
:ID: b8f679c7-3ac1-48d7-b1b5-8e4743a62767
:END:
#+title: IoT
#+filetags: :iot:

* Overview
** *Definition and Scope*
- The [[id:24f4040a-7c18-416a-8460-e69280d437bf][Internet]] of Things (IoT) refers to the [[id:a4e712e1-a233-4173-91fa-4e145bd68769][network]] of physical objects embedded with [[id:0bb707ba-24a5-44b3-8e23-45ade88f605c][sensors]], [[id:d9a3aabe-114b-43c6-81f9-ca6e01ed3f46][software]], and other technologies to connect and exchange data with other devices and systems over the internet.

** *Key Components*
- *Devices and Sensors*: Physical objects (often referred to as 'things') equipped with sensors and actuators. Examples include smart home devices, wearable health monitors, and industrial sensors.
- *Connectivity*: Communication protocols that enable connection and data exchange between IoT devices and systems. These include Wi-Fi, Bluetooth, Zigbee, and cellular networks.
- *Data Processing and Analytics*: Systems that gather, process, and analyze data collected from IoT devices, providing valuable insights and enabling automated responses.

** *Applications*
- *Smart Home*: Devices for home automation, such as smart thermostats, lighting systems, and security cameras.
- *Wearable Technology*: Wearable devices that monitor health and fitness parameters, like smartwatches and fitness trackers.
- *Industrial IoT (IIoT)*: Implementations in manufacturing, logistics, and supply chain management to improve efficiency and predictive maintenance.
- *Healthcare*: Remote monitoring devices for patient health, improving delivery of care and management of chronic diseases.
- *Smart Cities*: Urban infrastructure using IoT for traffic management, waste management, and environmental monitoring.

* IoT [[id:cf3fce52-77ad-4d0d-b934-0a87978f4f46][swarms]]
** *Definition and Concepts*
- IoT Swarms refer to groups of interconnected IoT devices working collaboratively to achieve a common goal. These can be compared to biological swarms (like those of bees or birds) where each entity participates in a larger system or function.
- Communication and Coordination: IoT swarms rely heavily on peer-to-peer communication and require sophisticated algorithms to coordinate actions among the devices.
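
As a rough illustration of peer-to-peer coordination, here is a gossip-averaging sketch. This is an assumed algorithm chosen for the example (the note above does not name one), and the ring topology and sensor readings are hypothetical. Each device repeatedly averages its value with one neighbour, and the swarm settles on the global mean without any central coordinator:

#+begin_src python
def gossip_round(values):
    """One sweep: each device averages its value with its ring neighbour."""
    updated = list(values)
    n = len(updated)
    for i in range(n):
        j = (i + 1) % n  # neighbour on a ring; an assumed topology
        mean = (updated[i] + updated[j]) / 2
        updated[i] = updated[j] = mean  # a pairwise exchange keeps the total unchanged
    return updated


readings = [18.0, 22.0, 30.0, 25.0, 20.0]  # hypothetical sensor values
state = readings
for _ in range(100):
    state = gossip_round(state)

# The value every device should agree on: the mean of the initial readings.
consensus = sum(readings) / len(readings)
#+end_src

Every exchange is local, so no device needs a view of the whole swarm. The trade-off is that agreement takes multiple rounds, which is where the latency concerns in large swarms come from.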

** Applications and Use Cases
- Environmental Monitoring: Swarms of drones that can autonomously collect data over large areas, providing insights into climate patterns or disaster management.
- Smart Agriculture: Utilizing swarms of IoT devices to automate and optimize farming processes, like watering, seeding, or pest control.
- Search and Rescue: Deploying swarms of drones or robots in search and rescue missions, where they can survey large areas quickly and efficiently.

** Challenges and Considerations
- Scalability: Ensuring that the system can handle the coordination of potentially thousands of devices without bottlenecks.
- Latency and Responsiveness: Maintaining low-latency communication to ensure timely coordination and response between devices.
- Security: Protecting data integrity and preventing unauthorized access to the swarm network.

** Connections and Implications
- The concept of IoT swarms connects with concepts from distributed computing, autonomous systems, and machine learning, as these technologies can help manage and optimize swarm operations.
- IoT swarm developments may revolutionize areas such as logistics, disaster response, and environmental conservation through enhanced automation and operational efficiency.
27 changes: 27 additions & 0 deletions Content/20241101170336-swarm_networks.org
@@ -0,0 +1,27 @@
:PROPERTIES:
:ID: cf3fce52-77ad-4d0d-b934-0a87978f4f46
:END:
#+title: Swarm Networks
#+filetags: :meta:

* Overview
** *Definition:*
- Swarm Networks involve the collective behavior of decentralized and self-organized systems. Typically, the term is inspired by [[id:2ac1cb5c-fd21-41a7-a30a-d6a2080d973e][biological systems]] such as ant colonies, bird flocking, or fish schooling.

** *Characteristics:*
- Distributed control without a centralized authority.
- Robustness to errors and failures due to redundancy across the network.
- Scalability: the network can grow in size without a proportional increase in coordination complexity.

** *Applications:*
- [[id:f1ec552e-a7c4-47ae-9dd2-a23733d1da92][Robotics]]: Swarm robotics uses multiple robots to collectively achieve tasks that individual units cannot accomplish alone.
- Telecommunications: Network protocols can leverage swarm intelligence for routing and data dissemination.
- Optimization Problems: Algorithms like Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) solve complex computational problems by simulating swarm behaviors.
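
To make the PSO mention concrete, here is a minimal sketch of the textbook velocity and position update, minimising a toy objective. The objective (=sphere=), the coefficient values, the swarm size and the seed are all assumptions picked for illustration, not tuned recommendations:

#+begin_src python
import random


def sphere(point):
    """Toy objective: sum of squares, minimised at the origin."""
    return sum(x * x for x in point)


def pso(objective, dim=2, particles=20, iterations=200, seed=0):
    """Basic global-best PSO; returns the best point found and its value."""
    rng = random.Random(seed)
    inertia, cognitive, social = 0.7, 1.5, 1.5  # assumed coefficient values
    positions = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    velocities = [[0.0] * dim for _ in range(particles)]
    personal_best = [p[:] for p in positions]
    global_best = min(personal_best, key=objective)[:]

    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            if objective(positions[i]) < objective(personal_best[i]):
                personal_best[i] = positions[i][:]
                if objective(personal_best[i]) < objective(global_best):
                    global_best = personal_best[i][:]
    return global_best, objective(global_best)


best_point, best_value = pso(sphere)
#+end_src

Each particle only tracks its own best position plus one shared best, which is the "swarm" part: no particle computes gradients or sees the full search space.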

** *Technologies in Use:*
- [[id:b8f679c7-3ac1-48d7-b1b5-8e4743a62767][IoT]] devices often utilize principles of swarm intelligence to manage network traffic effectively.
- Blockchain technology can leverage swarm principles for decentralized consensus mechanisms.

** *Challenges:*
- Coordination and communication overhead in large-scale networks.
- Security threats due to the decentralized nature and potential for malicious entities to disrupt operations.
9 changes: 9 additions & 0 deletions Content/20241101175831-cache.org
@@ -0,0 +1,9 @@
:PROPERTIES:
:ID: c8a3e246-0f29-4909-ab48-0d34802451d5
:END:
#+title: Cache
#+filetags: :data:

- high speed memory taking advantage of the temporal locality of reference principle -> recently accessed data is likely to be accessed again.

- caches are a good first step towards improving a [[id:2f67eca9-5076-4895-828f-de3655444ee2][DataBase's]] performance under multiple accesses.
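
A small sketch of how temporal locality pays off, assuming a least-recently-used (LRU) eviction policy. LRU is my choice for the example (the note does not name a policy), and =slow_lookup= is a hypothetical stand-in for a database query:

#+begin_src python
from collections import OrderedDict


class LRUCache:
    """Keeps the most recently used entries; evicts the least recently used."""

    def __init__(self, capacity, load):
        self.capacity = capacity
        self.load = load  # slow backing lookup, e.g. a database query
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        self.misses += 1
        value = self.load(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry
        return value


def slow_lookup(key):
    """Hypothetical stand-in for an expensive database read."""
    return key.upper()


cache = LRUCache(capacity=2, load=slow_lookup)
results = [cache.get(key) for key in ["a", "b", "a", "a", "c", "b"]]
#+end_src

In this access pattern the repeated reads of ="a"= are served from the cache, while ="b"= is evicted before it is requested again, so a cache only helps when accesses actually cluster in time.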
