Quantifying the design-space tradeoffs in autonomous drones

With fully autonomous flight capabilities coupled with user-specific applications, drones, in particular quadcopter drones, are becoming prevalent solutions in myriad commercial and research contexts. However, autonomous drones must operate within constraints and design considerations that are quite different from any other compute-based agent. At any given time, a drone must arbitrate among its limited compute, energy, and electromechanical resources. Despite huge technological advances in this area, each of these problems has been approached in isolation and drone systems design-space tradeoffs are largely unknown. To address this knowledge gap, we formalize the fundamental drone subsystems and find how computations impact this design space. We present a design-space exploration of autonomous drone systems and quantify how we can provide productive solutions. As an example, we study widely used simultaneous localization and mapping (SLAM) on various platforms and demonstrate that optimizing SLAM on FPGA is more fruitful for the drones. Finally, to address the lack of publicly available experimental drones, we release our open-source drone that is customizable across the hardware-software stack.


INTRODUCTION & MOTIVATION
Over the last decade, significant progress has been made in the development of autonomous systems. The numerous advances in drones popularized by quadcopters [1] are partly due to the countless applications addressed by these systems, such as aerial mapping [2,3], exploration [4,5], military [6], natural disaster recovery [7], search and rescue [8], ecology [9,10], and entertainment [11–14]. The quadcopter design possesses many advantages over other aerial vehicle designs in terms of simplicity and efficiency [15–17]. Thus, quadcopters are becoming prevalent and many control, planning, and perception methods have been assimilated for them [15, 18–28].
ASPLOS '21, April 19–23, 2021
Nevertheless, drones must operate under conditions that are quite different from those of any other compute-based agent. First, weight and power are restrictive parameters in drones. Second, drones must arbitrate among their limited compute, energy, and electromechanical resources not only based on the current tasks and local conditions (e.g., wind, air temperature), but also according to the flight plan. Despite huge technological advances in drones, these problems have been approached in isolation, and the end-to-end design-space tradeoffs are largely unknown.
As a result of such isolated problem solving, architecting end-to-end drone systems and their computation landscape remains an open question. For example (Figure 1), if we are making a special chip for drones, is it useful to improve processor performance and, if yes, is it because of energy savings or better control? How useful is improving processor power efficiency given that the majority of power consumption comes from resources other than computing? Should we focus on optimizing the flight-related tasks, or should we focus on secondary tasks such as recognition and autonomy? These questions pertain to creating cost-effective solutions with low system integration cost, reasonable development time, and effectiveness on drone metrics. Prior studies [29,30] have proposed a closed-loop simulator and benchmark suite, which do not completely answer the above questions because they focus on high-speed drones (more in §6). To answer such questions and solve worthy research problems, we need to understand fundamental drone subsystems, classify drone computations and their requirements, extract design-space tradeoffs, and have access to a reproducible experimental platform.
This is the first paper to formalize and quantify the design-space tradeoffs of autonomous drone systems. To do so, first, we address the lack of a publicly available and reproducible experimental drone framework that is customizable across its hardware-software stack by releasing our open-source drone. Then, by exploiting this new experience, we study the computational profile and landscape of such systems, in which we must understand three major drone metrics: flight time, control response time, and autonomous features. Despite the extensive knowledge in our community, we discover several missing pieces of the puzzle in understanding conventional drones, described in the following.
(1) Flight time: Flight time is determined by the power consumption of the drone during flight and the battery capacity. However, the power consumption depends on several factors, including drone weight, motor types, and flying activity. The battery capacity also directly affects weight; a larger battery has a higher capacity but is heavier.
(2) Control response time: The control response time of a drone is determined by its control system. However, we do not know if additional computation power would enhance this system.
(3) Autonomous features: With several exciting new applications in drones, such as machine learning applications, it is important to understand how they interact with the main control system, what their computation profile is, and how to quantify any opportunity for optimization.
To answer the aforementioned questions for flight time, after introducing fundamental propulsion and power systems (§2.1.1, §2.1.2), we extract crucial metrics from over 300 commercial components and 150 manufacturers (§3.1) to find the major relationships in determining the weight and power consumption of drones (§3.2). Using the empirical measurements and physics, our method directly translates compute power efficiency to flight time by untangling the multifaceted relationships in drones. For instance, we quantify that the share of computation power in total power ranges widely from 2–30%, so computation power savings can yield up to +5 minutes of flight time.
In §2.1.3, we analyze the control system of drones, namely, inner-loop and outer-loop controls. For instance, we discover that the critical inner-loop controls in drones have an update frequency of 50–500 Hz, which is not limited by computation power, but by the physical response of the drone. Finally, in §2.2, we shed light on the wide variety of autonomy in drones, current customized compute boards for drones, and discover that these systems are highly dependent on a core family of algorithms, namely, simultaneous localization and mapping (SLAM). Then, in Figure 2 in §3, we present several important tradeoffs in drones, including the computation power footprint. Next, in §4, we develop our open-source platform. Finally, in §5, we showcase optimizations for SLAM on various platforms. For instance, we show that moving from GPU/CPU to FPGA provides 20x power savings, enabling 15–20% (+2–3 minutes) of additional flight time in small drones.
This is the first paper to contribute the following:
• Formalizes the fundamental drone subsystems and quantifies the design-space tradeoffs for the computational profile of drones to discover how computation power consumption affects drone flight time, accomplished by incorporating physics and empirical measurements from 300 commercial components and 150 manufacturers.
• Clearly separates the required computing for inner-loop controls (real-time requirements) vs. outer-loop controls (autonomous features) in drones and outlines the required computation amount and benefits gained.
• Showcases the optimization landscape of the widely used SLAM algorithm in autonomous drones and the effects on flight time by using the presented data.
• Develops an open-source and reproducible platform with a customizable hardware-software stack to address the lack of publicly available drone platforms.

AUTONOMOUS DRONES
Autonomous drone subsystems determine several crucial properties of a drone, and the associated design choices have a pivotal impact on the effectiveness of the end-to-end system. However, each subsystem has been studied in isolation. This section first briefly introduces these subsystems, and then extracts necessary details pertaining to computations. Figure 3 overviews the main subsystems for a quadcopter drone. We divide the fundamental subsystems as follows: propulsion system, which generates necessary force for movement and lift; power delivery system, which delivers the power to electromechanical components; and control, compute, and acquisition system, which controls and stabilizes the drone with the help of sensors.
The physics behind the essential movements of quadcopters is relatively simple, shown in Figure 4. By ignoring several complexities, covered in §2.1.3, all movements stem from the precise control over each rotor's thrust while accounting for several environmental factors such as wind and air density. Drones use the same uplift thrust for horizontal movements by tilting. The maximum horizontal speed depends on the maximum stable angle of attack (i.e., tilt angle), which depends on the thrust-to-weight ratio (§2.3).
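The dependence of the maximum tilt angle on the thrust-to-weight ratio can be illustrated with a simplified steady-state model. The sketch below assumes an idealized quadcopter that must keep the vertical component of its total thrust equal to its weight while tilting; it ignores drag, wind, and control margins, and the function names are hypothetical rather than taken from the paper.

```python
import math


def max_tilt_angle_deg(twr: float) -> float:
    """Largest tilt angle (degrees) at which altitude can still be held.

    Idealized model (an assumption, not the paper's method): at tilt angle
    theta the vertical thrust component is T_max * cos(theta), which must
    equal the weight W, so cos(theta_max) = W / T_max = 1 / TWR.
    """
    if twr < 1.0:
        raise ValueError("a TWR below 1 cannot even hover")
    return math.degrees(math.acos(1.0 / twr))


def horizontal_thrust_fraction(twr: float) -> float:
    """Horizontal thrust at the maximum tilt, as a multiple of the weight."""
    theta = math.radians(max_tilt_angle_deg(twr))
    return twr * math.sin(theta)


if __name__ == "__main__":
    # Ratios taken from the common 2:1 to 7:1 range mentioned in the text.
    for twr in (2.0, 4.0, 7.0):
        print(f"TWR {twr}:1 -> max tilt {max_tilt_angle_deg(twr):.1f} deg")
```

Under this model, a TWR of 2:1 leaves a tilt of up to 60 degrees before the drone starts to lose altitude, which is why a higher TWR allows faster horizontal flight.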

Power Delivery System.
Lithium polymer (LiPo) batteries (lithium-ion polymer), which have the highest energy density (Wh/kg) and discharge rate (a measure of how fast the battery can be safely discharged) in rechargeable lithium-ion technology, are the main source of power in drones. Since brushless DC (BLDC) motors require high current, the high current flow of LiPo batteries is a critical factor. However, the downside is that LiPo batteries are relatively fragile; only 85% ($LiPo_{DrainLimit}$) of their capacity should be used during a flight. LiPo batteries come in various configurations that are multiples of cells with a nominal voltage of 3.7 V/cell, studied in §3.1.
Each BLDC motor requires three-phase currents, which are generated from DC current by a separate electronic speed controller (ESC). The complexity of the ESC circuits is evident, as they need a switching frequency of 60–600 kHz while delivering hundreds of watts. ESCs also provide the necessary electronics to implement feedback and achieve precise control of the rotation speed using their own microcontroller. ESC protocols usually go beyond PWM (pulse-width modulation) signals for modern-day drones because of the high precision required in control (e.g., the DShot1200 protocol has a communication frequency of 74.6 kHz). The above criteria make ESCs one of the heavier components (§3.1).

Control, Compute, and Acquisition System.
A. Inner vs. Outer Loops: The recent advancements in autonomous drone systems have mainly been accomplished with the developments of high-level algorithms in state estimation, trajectory tracking, localization, and deep learning [18, 20, 21, 24–26, 28]. Nonetheless, such high-level algorithms (i.e., outer-loop computations) rely on and are directly impacted by the inner-loop control [15,17,22,23] (Figure 6). High-level algorithms only provide state targets, grouped into position, velocity, and attitude¹, to the inner-loop control. The inner loop reaches those target set points over time by direct manipulation of the drone actuators while also maintaining a stable flight. Furthermore, remote controller (RC) commands and safety override commands pass through the inner loop to minimize response latency. Table 1 summarizes a handful of dynamic effects, such as stabilization, that are compensated by the inner-loop control for a stable flight, emphasizing the importance of the inner-loop control relative to the high-level algorithms.
¹ Defined as orientation of a solid body around three Cartesian axes.
Figure 5 shows the hardware-software stack abstraction of a drone. The flight controller boards with additional on-board sensors directly manipulate ESCs and sensors. Flight controllers have the following main components (Table 4 provides some examples): (i) a microcontroller (MCU), usually from the STM32F 32-bit Arm Cortex-M series; (ii) one or two 6-axis inertial measurement units (IMUs); (iii) a barometer for altitude measurements; and (iv) possibly several chips for sensors, video feed codec, and communication. If necessary, external sensors with their dedicated full-stack supporting system are added. The operating system (OS) is dominated by Linux, except for racing applications. The hardware abstraction layer (HAL) provides necessary APIs. The shared libraries layer provides common sensor fusion algorithms (e.g., the extended Kalman filter (EKF)). The control layer is described in the next paragraph. The final application-specific flight code layer largely depends on the application. Finally, the communication layer delivers stats to the ground station and, if necessary, offloads computations to another node through the MAVLink [31] protocol.
C. Inner-Loop Control: In the inner loop, the control layer uses the on-board sensors to stabilize the drone and reach the target set points dictated by the outer loop. This layer extensively uses high-performance hierarchical proportional-integral-derivative (PID) controllers, whose filter response and quality of the estimated state variables define the drone behavior. The feedback loop is shown in Figure 6 and is completed by sensor measurements. The control is performed hierarchically² by dividing the control problem into three levels depending on their response time, shown in Table 2b, known as time scale separation. The three levels are as follows: high-level position or trajectory, mid-level attitude, and low-level thrust controller [15,16,23,32]. Based on   and response frequency than 1 KHz is necessary, both for reading the sensors and updating the controllers.

D. Inner-Loop Control Computations:
We summarize the inner-loop control computations in two groups: (i) filter computations, such as the EKF for data fusion and updating PIDs, and (ii) algebraic functions for state estimation, such as air drag and trajectory. The filter computations consist of keeping a history and accumulated versions of previously observed measurements, their derivative, and their integral. The state estimation includes 3×3 matrix operations based on the measurable state of the drone, $x = (\vec{p}, \vec{v}, \vec{\Omega}, R)$, in which $\vec{p} \in \mathbb{R}^3$ is the position (using data from the IMU, GPS, and barometer), $\vec{v}$ is the velocity (using IMU), $\vec{\Omega} \in \mathbb{R}^3$ is the angular velocity (using IMU), and $R \in SO(3)$ is the attitude (using IMU) of the drone [17,32]. All control computations are effectively performed by an STM32F 32-bit Arm Cortex-M, a single-core processor with a frequency on the order of 100 MHz, even in high-speed racing drones. Although some research proposals suggest replacing control-theory-based algorithms with learning-based algorithms that require higher computation capabilities [27], the consensus is that unless a new electromechanical part introduces a drastically new response time, higher computation capabilities are not required. For instance, the inner-loop update frequency in high-end commercial products [33,34] ranges from 50 Hz to 500 Hz. Even for highly specialized sensor-based control techniques with incremental nonlinear dynamic inversion (INDI) that can stabilize a drone under powerful wind gusts [22], the update frequency is still 500 Hz. Thus, the update frequency of the inner-loop control is 50–500 Hz, which is limited by the physical response time and inertia of the control and electromechanical components in drones, not by the computation power.
² Hierarchical controllers are non-linear controllers that yield stability and enhanced robustness, especially for highly nonlinear dynamics (e.g., air drag). Other linear and non-linear controllers for drones [16] also have similar update frequencies.
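The filter computations described in this section (keeping accumulated and differentiated versions of past measurements) can be illustrated with a textbook discrete PID update. This is a minimal sketch and not the flight controller's actual implementation: the class name, the gains, and the 500 Hz rate in the usage lines are illustrative assumptions, and production controllers add output limits, derivative filtering, and anti-windup.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (illustrative, hypothetical gains)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0     # accumulated error (the "integral" history)
    prev_error: float = 0.0   # last error, used for the derivative term
    initialized: bool = False

    def update(self, setpoint: float, measurement: float, dt: float) -> float:
        """Return the control output for one time step of dt seconds."""
        error = setpoint - measurement
        self.integral += error * dt
        # Skip the derivative on the first call to avoid a spurious spike.
        derivative = (error - self.prev_error) / dt if self.initialized else 0.0
        self.prev_error = error
        self.initialized = True
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def cascaded_step(outer: PID, inner: PID, target_angle: float,
                  angle: float, rate: float, dt: float) -> float:
    """One step of a two-level cascade: the attitude loop produces a rate
    set point, and the rate loop turns it into an actuator command."""
    rate_setpoint = outer.update(target_angle, angle, dt)
    return inner.update(rate_setpoint, rate, dt)


if __name__ == "__main__":
    dt = 1.0 / 500.0  # assumed 500 Hz update rate, the upper end cited in the text
    attitude_loop = PID(kp=4.0, ki=0.0, kd=0.0)
    rate_loop = PID(kp=0.1, ki=0.05, kd=0.001)
    command = cascaded_step(attitude_loop, rate_loop, 0.1, 0.0, 0.0, dt)
    print(f"actuator command: {command:.4f}")
```

Each update costs only a handful of multiply-add operations, which is consistent with the observation that a microcontroller on the order of 100 MHz is sufficient for the inner loop.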

Autonomy in Drones
Autonomy in drones is realized by intelligently providing target states (i.e., position, velocity, and sometimes attitude) with the computation that occurs in the outer-loop control, as explained in §2.1.3. Although autonomy is a defined term in self-driving vehicles, meaning safely navigating from point A to B, such a definition is not set for drones. For instance, mapping drones are autonomous in the sense that they fly within a predefined airspace while covering the entire area for mapping [3,5,10,35]. Similarly, active-filming drones use vision cameras and recognition technologies to follow a predefined target and optimize the filming angles while avoiding obstacles [1,11,33]. As a result, autonomy in drones is still an active area of research and commercial products.
The outer-loop computations always occur in isolation, and the hardware dedicated for such computations varies from quad-core and Intel i7 CPUs in research-oriented studies [5,21] to custom-built computers based on Nvidia Jetson TX2 [33,36] or custom Intel boards (e.g., Intel Aero compute board [37]) in commercial products. From a computation perspective, to ensure that the inner-loop control is in real time, the computations for autonomous tasks in the outer loop are not co-located on the same computation core or even the same unit as for the inner-loop control.
We find that a wide variety of high-level algorithms in autonomous drones depend on a core family of algorithms, namely, simultaneous localization and mapping (SLAM) and visual odometry (VO) [38–42]. These algorithms are the fundamental building blocks for many autonomous technologies [19,39,43] and are used in various tasks such as navigation, obstacle avoidance, and path planning. Designing drone systems that provide accurate localization in real time on platforms with limited computational and energy resources is an active area of research [18,19,24]. Therefore, to date, various implementations of SLAM with the focus on algorithmic-level optimizations [38, 40–42, 44] or hardware acceleration [43, 45–50] have been proposed. We explore such hardware optimizations in §5.
Table 3 presents the definitions of the metrics used in drones. Most of the metrics are not standalone and depend on each other based on the design choices covered in §3.

QUANTIFYING DESIGN-SPACE TRADEOFFS
To understand the computational profile in autonomous drones, we must quantify the design-space tradeoffs that define several important features such as flight time. This section quantifies each tradeoff correlation with weight, which defines the order of power consumption in a drone. Then, we derive how the computation affects this design space.

Table 3: Definitions of the metrics used in drones.
Thrust-to-Weight Ratio (TWR): The maximum total thrust produced by the motors (g) divided by the drone's weight (g). Common ratios are from 2:1 to 7:1. A higher ratio enables a drone to perform elaborate aerobatics. Higher ratios mean heavier motors, larger batteries, and ESCs with a higher consumption. To find the highest possible contribution boundary of computation power consumption, we use a TWR of 2:1.
Thrust Per Motor: The thrust produced by a motor depends on the propeller diameter and pitch, supply voltage, KV rating, and motor design. The KV rating is used to calculate the rotation speed (RPM) of the motor per supply voltage, $RPM = KV \times V$. So, for a fixed voltage, a lower KV rating produces more torque and turns larger propellers. A propeller with a larger diameter and pitch moves a larger volume of air per rotor revolution and provides larger thrust. The maximum propeller size is determined by the frame size, or wheelbase.
Discharge Rate: The battery discharge rate or C rating is a measurement of the maximum continuous current a battery can safely supply. The maximum continuous current from the C rating is calculated as $I_{max} = C_{rating} \times Capacity$, with the capacity expressed in Ah.
Battery (xSyP): A LiPo battery has a nominal voltage of 3.7 V/cell. To supply more voltage, cells are packed together in series. The convention is to write the configuration as xSyP, which means y parallel packs of x cells in series. To provide a high thrust-to-weight ratio, we need motors with a lower KV rating for higher torque, which means a higher voltage is required to achieve good RPMs for lifting.
ESC Max. Current: ESCs must be able to supply constant current to the motors while the drone is flying. The maximum continuous current value shows how much current an ESC can handle, which directly depends on the type of motor and propeller.
Frame Wheelbase: The frame size or wheelbase is the distance between two diagonal arms of a quadcopter. The wheelbase defines the maximum propeller diameter a drone can use. Indoor drones have a wheelbase under 100 mm, while outdoor drones have wheelbases up to 1000 mm.

Important Tradeoffs
Battery Stored Energy & Weight. Drones predominantly use LiPo batteries (§2.1.2), which constitute a large fraction of a drone's weight. Although a higher capacity battery has more charge, the additional weight may result in a shorter flight time (not to mention the additional weight of sensors and computation units). Hence, it is crucial to understand the relationship between the capacity (mAh) and the weight of the batteries. Although the range of energy densities of LiPo batteries is known, these values are insufficient for accurate estimations for two reasons. First, we are interested in the end product, which also includes casings, wires, and protection circuits. Second, as the manufacturing process is not ideal with various discharge rates, estimation based on energy density is not precise. To address this knowledge gap, we study 250 commercial batteries. By grouping the batteries based on their configuration in number of cells (see §2.3), we derive a linear relationship between the capacity and the weight of the batteries, shown in Figure 7.
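The metric definitions in Table 3 translate into simple arithmetic. The sketch below is illustrative only: the function names and the example battery, motor, and weight values are hypothetical, the relation RPM = KV × V is a no-load figure that ignores losses under load, the current formula is the standard C-rating definition (C rating times capacity in Ah), and the per-motor thrust assumes four identical motors.

```python
NOMINAL_CELL_VOLTAGE = 3.7  # V per LiPo cell, as stated in the text


def pack_voltage(cells_in_series: int) -> float:
    """Nominal voltage of an xS pack (x cells in series)."""
    return cells_in_series * NOMINAL_CELL_VOLTAGE


def no_load_rpm(kv_rating: float, voltage: float) -> float:
    """Rotation speed from the KV rating: RPM = KV x V (no-load estimate)."""
    return kv_rating * voltage


def max_continuous_current_a(c_rating: float, capacity_mah: float) -> float:
    """Maximum continuous current in amperes: C rating x capacity in Ah."""
    return c_rating * capacity_mah / 1000.0


def required_thrust_per_motor_g(weight_g: float, twr: float = 2.0,
                                motors: int = 4) -> float:
    """Maximum thrust each motor must deliver to reach the target TWR."""
    return twr * weight_g / motors


if __name__ == "__main__":
    # Hypothetical example: a 3S 2200 mAh 25C pack, 920 KV motors, 1200 g drone.
    volts = pack_voltage(3)
    print(f"pack voltage: {volts:.1f} V")
    print(f"no-load speed: {no_load_rpm(920, volts):.0f} RPM")
    print(f"max continuous current: {max_continuous_current_a(25, 2200):.0f} A")
    print(f"thrust per motor at TWR 2: {required_thrust_per_motor_g(1200):.0f} g")
```

The last helper makes the cost of a higher TWR visible: doubling the target ratio doubles the thrust every motor must deliver, which in turn pushes toward heavier motors, ESCs, and batteries.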
Generally, for batteries with higher voltages (for motors with higher torque), we observe a higher overhead. However, these batteries are necessary to lift the drone. The figure also includes discharge rates; higher discharge rates result in heavier batteries, but the resulting weight does not deviate from the extracted formulas per battery configuration.
... ESCs, targeting racing drones; and long-flight ESCs, targeting all other use cases. In racing, ESCs are designed with lighter MOSFETs and capacitors that overheat in longer flights.

Frames.
A larger drone frame size leads to more choices in the components, ability to house new sensors, and larger propellers. However, even with carbon and glass fiber technology, the weight of a frame is not negligible. Thus, we study 25 commercially available frames in Figure 8b and extract the correlation between their weight and wheelbase.
Propulsion System. The motors and propellers of drones have a wide variety of configurations; thus, the tradeoffs of the propulsion system are multifaceted and complex. The main deciding factor in the process is the target TWR. Since we are interested in understanding the computational profile in the most efficient designs, we set the target TWR to 2, the minimum required value for flying. Thus, the derived values specify the highest percentage of possible contribution of computation power. Figure 9 shows an extrapolated relationship between the max current draw of the appropriate motors and the corresponding drone's basic weight (i.e., not including battery, ESCs, and motor weight) grouped by the supply voltage (i.e., the cells of the LiPo battery). For each frame, we first set the maximum propeller diameter in inches dictated by the wheelbase (written in the legend). Then, we extract the thrust and KV rating of the motors from data released by 150 manufacturers. Then, by varying the weight and supply voltage, we calculate the minimum required max current draw per motor.

Figure 9: Relationship between the max per-motor current draw and the basic weight, grouped by supply voltage and wheelbase sizes from 50–800 mm. TWR is 2 and data is extracted from 150 manufacturers. (Legend entries give supply voltage, wheelbase, and propeller diameter, e.g., 1S-50mm-1", 6S-100mm-2", 6S-450mm-10".)

Table 4 (excerpt, LiDAR rows): [66]: 1600, 15 W, self-powered; Ultra Puck [67]: 925, 10 W, self-powered.

In Figure 9, we see that heavier drones have motors with higher KV ratings for higher rotation speeds. Moreover, in larger wheelbases, larger propellers are needed to lift the drones. This is because it is physically impossible to use smaller propellers with high RPMs. These large propellers require higher torque from the motors. Thus, these motors have a lower KV rating (compare KV ratings in Figure 9a vs. b). However, because of their larger size (to create the necessary torque, the motors have a greater number of poles and larger diameters), these motors are much heavier (e.g., from 5 g/motor in 100 mm drones to 100 g/motor in 1000 mm drones).
Flight Controllers, On-Board Computation, & Sensors. Table 4 lists common open-source and commercial flight controllers, additional computation boards, and external sensors. All of the flight controllers have an integrated STM32F Arm Cortex-M series processor as the main inner-loop controller (§2.1.3). We divide the flight controllers into two groups: basic, which provides only necessary inner-loop functions with limited outer-loop capabilities; and improved, which provides customizable inner-loop functions and a few outer-loop functions. In commercial markets [33,36,37], the Nvidia Jetson TX2 embedded board is considered a high-end solution with a price of $300. The power consumption of these compute boards ranges from 0.5–20 W. Therefore, in the following section, we assume two levels of power consumption: a 3 W and a 20 W chip, representing basic and advanced flight controllers, respectively. For external sensors, we list the first-person view (FPV) cameras with a maximum of 1 W consumption. High-definition (HD) cameras are self-powered with weights around 100 g. Specific LiDAR solutions optimized for drone technologies are also listed in the table for completeness. All options are stand-alone and weigh around 1 kg. To make integration easier, state-of-the-art LiDAR solutions have their own battery and compute boards. We study how the addition of these sensors, because of their weight, reduces the contribution boundary of main computation power in large drones.

Computation Footprint
Procedure: To understand the computational profile, we derive the total power consumption of a wide range of drones, from small indoor drones (100 mm wheelbase) to large military and filming drones (800 mm wheelbase). We use the data extracted in §3.1 while accounting for the additional weight and power consumption of each module. In detail, for each frame (Figure 8b), we choose the propeller with the maximum size, find the required RPM for the motors, and choose the best matching motor depending on the number of cells in the LiPo battery, while sweeping the battery capacity from 1000 mAh to 8000 mAh (Figure 7, Equation 4). Then, from the maximum motor current draw (Figure 9, Equation 2), we choose ESCs (Figure 8a). In this step, if the additional weights necessitate a new motor, we redo the previous steps (Equation 1). By assuming a low-load hovering condition ($Flying_{Load}$, 20–30% of the maximum current draw) with an 85% LiPo battery capacity limit ($LiPo_{DrainLimit}$), we calculate the power consumption (Equation 7), shown in Figures 10a, b, and c for 100, 450, and 800 mm wheelbases, respectively.

$$\mathit{Batt}_{Capacity} = M(\mathit{LiPo}_{Capacity},\ \%\mathit{Power}_{Eff},\ \%\mathit{LiPo}_{DrainLimit})$$
$$\%\mathit{Power}_{Computation} = X(\mathit{Power}_{Avg},\ \mathit{Power}_{Compute})$$
$$+\mathit{FlightTime}_{Compute} = Z(\%\mathit{Power}_{Computation},\ \mathit{FlightTime})$$

Figure 10: Top row (a, b, c): The total power consumption of drones with various wheelbases extracted by relationships in Equation 3.2 and verified with data from commercial drones shown as additional data points [33, 51–55]. Bottom row (d, e, f): The computation footprint considering 3 W and 20 W chips shown with 3/20W computation @ hovering/maneuvering lines. The x-axis of each panel is drone weight (g).
Validation: We validate our data by adding commercial drone data points using the released flight times and battery configurations [33, 51–55], shown as additional diamond-shaped data points in Figure 10. No data skewing or pre-selection is used for extracting tradeoffs (i.e., all data points are shown). Additionally, we verify the average power consumption by calculating the total flight time to match with current state-of-the-art commercial drones, resulting in 23, 19, and 21 minutes of flight time for 100, 450, and 800 mm wheelbases, respectively.
Interpretation: Figures 10d, e, and f illustrate the percentage of computation power from the total power of a drone in two groups with hovering and maneuvering (20–30% and 60–70% of the maximum current draw, respectively). The first group with a 3 W compute power represents a commercial ultra-low-power flight controller. The second group with a 20 W compute power represents a GPU-CPU system with much higher capabilities. First, we see that the 3 W chips have less than 5% contribution in total power consumption. Second, even for the 20 W system, when the drone moves, the contribution drops to an average of 10%. Moreover, we see jumps that occur because heavier drones need batteries with more cells to provide higher voltage for higher KV motors. However, initially, those batteries are less efficient than the batteries used for lighter drones. Also note that these drones have a target TWR of two; hence, the contribution shown is at its highest. To quantify, we can convert these power savings to extra gained flight time (see Equation 7). In large- to medium-sized drones, the average computation power is 10% of the total power, and the maximum gain from computation power savings is about +2 minutes in total flight time, possibly less considering maneuvering and higher TWR values. For small-sized drones, the tradeoff between the computation and flight power is more critical.
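To make the conversion from computation power savings to flight time concrete, the following sketch uses a simplified energy-balance model. It is an illustration under stated assumptions, not the paper's Equation 7: flight time is taken as usable battery energy divided by average total power, usable energy is limited by the 85% drain limit, and the function names and all drone parameters in the example are hypothetical.

```python
NOMINAL_CELL_VOLTAGE = 3.7  # V per LiPo cell, as stated in the text
LIPO_DRAIN_LIMIT = 0.85     # usable fraction of the capacity, per the text


def usable_energy_wh(capacity_mah: float, cells: int,
                     drain_limit: float = LIPO_DRAIN_LIMIT) -> float:
    """Energy that can be safely drawn from the pack, in watt-hours."""
    return capacity_mah / 1000.0 * cells * NOMINAL_CELL_VOLTAGE * drain_limit


def flight_time_min(capacity_mah: float, cells: int,
                    total_power_w: float) -> float:
    """Flight time in minutes at a constant average total power."""
    return usable_energy_wh(capacity_mah, cells) / total_power_w * 60.0


def compute_power_share(total_power_w: float, compute_power_w: float) -> float:
    """Fraction of the total power spent on computation."""
    return compute_power_w / total_power_w


def flight_time_gain_min(capacity_mah: float, cells: int, total_power_w: float,
                         compute_power_w: float,
                         new_compute_power_w: float) -> float:
    """Extra minutes gained when only the compute power changes."""
    new_total = total_power_w - compute_power_w + new_compute_power_w
    return (flight_time_min(capacity_mah, cells, new_total)
            - flight_time_min(capacity_mah, cells, total_power_w))


if __name__ == "__main__":
    # Hypothetical small drone: 3S 2200 mAh pack, 80 W total including 20 W compute.
    share = compute_power_share(80.0, 20.0)
    gain = flight_time_gain_min(2200, 3, 80.0, 20.0, 1.0)
    print(f"compute share: {share:.0%}, gain from a 20x cut: {gain:.1f} min")
```

Because the gain depends on the share of computation in the total power, the same absolute saving yields more minutes on a small drone than on a large one, matching the trend described above.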
In addition to Figures 10a and d, we also study the power consumption of nano and micro commercial drones, outlined in Figure 11 [33,51,53,55,68,69]. For these drones, when hovering, computation accounts for 2–7% of the total power consumption. Nevertheless, when hovering with heavy computations (e.g., face recognition, HD video recording), the contribution of computations in total power consumption reaches 10–20% (shown with a yellow line in Figure 11). Thus, in small drones, by optimizing heavy computations such as SLAM and deep learning workloads, we can potentially increase the flight time by up to 20%, or around +5 minutes in total flight time.
How to Use This Data: Figure 12 illustrates the procedure for obtaining the total and compute power consumption of a drone depending on its size and battery capacity. Thus, we can understand how power savings or special chips affect the flight times and weights for all drones. We showcase SLAM in §5 as an example.
To address the lack of publicly available, reproducible end-to-end experimental frameworks for drones, we develop and flight-test a fully open-source experimental drone that is customizable across its hardware-software stack. We integrate several widely used hardware/software components. This platform, shown in Figure 13, reduces the barriers to entry to drone research and has a total cost of $500 with the ability to carry 200 g of additional payload. The components, as long as they are compatible (e.g., voltage, connections), can be easily switched or added to the drone. The currently available alternatives do not provide full access to the hardware-software stack and have no extra-weight capacity.
Hardware-Software Ecosystem. With the backing of the Navio2 [61] for crucial flight inner-loop control, our drone uses a Raspberry Pi [62] (RPi) with a maximum power of 5 W. We integrate high-level autonomous flying firmware [70] to run advanced waypoint navigation algorithms and autonomously execute certain actions based on the results of the SLAM algorithm [71]. The following sections overview the four layers of the drone's ecosystem.
(1) High-Level Functions: The high-level functions layer consists of high-level and low-level APIs which are used to write custom code and firmware. The custom firmware is converted to a Linux service and run on the Pi in the background. We also utilized the DroneKit [72] C++ and Python APIs, which were modified to allow the drone to be reconfigured mid-flight. DroneKit allows us to connect to the drone, issue flight commands, and monitor the drone. Apart from being open-source, DroneKit is easily extensible and provides the flexibility to be used on on-board computers as well as ground-station applications by abstracting away physical MAVLink [31] protocols.
(2) Autopilot: ArduCopter [70] is an open-source autopilot code-base for drones with great versatility. ArduCopter, written in C++, allows for manual flying/autonomous control. Our modified Linux kernel allows ArduCopter to utilize loop-back ports to listen to commands being issued by external applications executing on other computers (e.g., RPi). The ArduCopter binary, once compiled with WAF [73], runs several Linux daemons [74] with distributed roles.
(3) Modified Linux Kernel: The Linux kernel is modified to support the PREEMPT_RT patch, which makes the Linux operating system suitable for the real-time requirements of drones. Using this, we can completely shut down an instance of a drone mission and spool up a new mission while the drone is in mid-flight, safely and securely using WAF [73]. The Linux kernel is also modified to support continuous loop-back and server instances so the drone can be controlled using multiple devices, such as through 915 MHz telemetry or a laptop through Secure Shell (SSH).
(4) Flight Controller: We use the Navio2 controller with a Cortex-M3 coprocessor, GPS, and 2x IMUs. Navio2 has generic GPIO pins for any compute board and provides the connection to our RPi. During flight, the RPi sends signals to the board that are decoded by the controller.
(5) Hardware Control Surfaces: The controllable hardware consists of sensors, four motors, and ESCs. The weight breakdown of our drone is shown in Figure 14, which follows trends similar to those in §3.1. The frame, battery, motors, and ESCs are the major components contributing to the weight.
A New Platform Different From Current Platforms. Several popular commercial drones such as the CrazyFlie [75] or the PlutoX [76] have drastic tradeoffs between performance and flight time while limiting user access to flight code or being unable to carry additional payloads. Moreover, they can be configured only for limited purposes. With our drone, our goal is to minimize that tradeoff and give users the power to import both high- and low-level (i.e., outer- and inner-loop) functions. Our drone can be configured for a variety of research purposes because the hardware stack is configurable. Moreover, we use Linux with the PREEMPT_RT patch to allow for a wide range of applications while enabling control of the drone and its parameters in real time. We choose the Navio2 flight controller because it is easily configurable for different applications and grants complete access to all control systems.

SHOWCASING OPTIMIZATIONS
This section exhibits the impacts of design optimizations on performance and power consumption and concludes with the impact of optimization on flight time. To study this, we explore offloading ORB SLAM onto various hardware platforms.
Experimental Setup & Platforms. Our baseline platform to execute autopilot and SLAM (ORB SLAM [71]) is an RPi4 [62]. We measure the power consumption of the RPi using a USB digital multimeter that records measurements once every half second (± 10 mW). The power consumption of the entire drone is measured with a digital oscilloscope by sampling both the battery current and voltage every 20 ms (± 0.5 mW) while controlling the drone. To measure performance at the instruction level, we use Linux perf and carry out the analysis while the entire software-hardware stack is in the loop. Our hardware platforms for implementing SLAM include a separate RPi4 [62], Nvidia Jetson TX2 [63,71,77], and a ZYNQ XC7Z020 FPGA on a PYNQ-Z1 embedded board. All the SLAM experiments run with the relevant EuRoC micro aerial vehicle dataset [78], while confirming SLAM key metrics.
For the FPGA implementation, we use Xilinx Vivado HLS and describe our tailored microarchitecture in C++ by using the relevant #pragma directives. We use the post-implementation resource utilization, power consumption, and latency reported by Vivado. Inputs and outputs of the accelerator are transferred through the AXI stream interface. The clock frequency is set to 100 MHz. As before, we use the EuRoC dataset. For ASIC comparisons, we use Navion, the 20 mm² ASIC implementation by Suleiman et al. in 65nm CMOS [19]. Navion is a visual-inertial odometry (VIO) accelerator that does not include the full-loop feedback of SLAM; nevertheless, it indicates the order of magnitude of power consumption in ASIC implementations. Navion processes the EuRoC dataset in real time at 20 frames per second (FPS) while consuming a maximum of 24 mW.

Running Autopilot and SLAM on RPi
Performance. When running SLAM along with the autopilot on an RPi, SLAM is not only too slow, but it also negatively impacts the performance of the autopilot. For instance, the presence of SLAM causes 4.5× as many TLB misses as the autopilot alone causes. Similarly, we observed that the LLC and branch-prediction miss rates of the autopilot with SLAM are also higher than those when running the autopilot alone, as the primary axis in Figure 15 shows. Additionally, as the secondary axis in Figure 15 shows, the IPC of the autopilot decreases by 1.7×. These observations indicate that by running a few additional workloads, specifically heavy ones, the real-time response of the autopilot will lag and we will miss several outer-loop deadlines. Although the outer-loop control is not directly related to the control system, improving the performance of processors is necessary to handle heavy computations that are introduced by new workloads.
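The ratios quoted above follow directly from raw hardware counter totals. As a minimal sketch, IPC and the relative change between two runs could be derived as shown below. The counter values are made-up placeholders chosen only to reproduce the reported 1.7× and 4.5× ratios (they are not measured data), and the function and key names are hypothetical.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle from two counter totals."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def ratio(with_slam: float, alone: float) -> float:
    """How many times larger a metric is with SLAM than with the autopilot alone."""
    if alone <= 0:
        raise ValueError("baseline must be positive")
    return with_slam / alone


# Placeholder counter totals for the autopilot process (illustrative only).
autopilot_alone = {"instructions": 1_700_000, "cycles": 2_000_000, "tlb_misses": 1_000}
autopilot_with_slam = {"instructions": 1_000_000, "cycles": 2_000_000, "tlb_misses": 4_500}

ipc_alone = ipc(autopilot_alone["instructions"], autopilot_alone["cycles"])
ipc_slam = ipc(autopilot_with_slam["instructions"], autopilot_with_slam["cycles"])

ipc_slowdown = ipc_alone / ipc_slam  # factor by which the autopilot IPC drops
tlb_increase = ratio(autopilot_with_slam["tlb_misses"], autopilot_alone["tlb_misses"])

print(f"IPC drops by {ipc_slowdown:.1f}x, TLB misses grow by {tlb_increase:.1f}x")
```

Expressing both metrics as ratios against the autopilot-only run keeps the comparison independent of how long each run lasted, provided the counters are collected over comparable intervals.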
Power Consumption. Figure 16a shows the power consumption graph of the RPi during flight. We measure the power consumption of the RPi while it is executing the autopilot software, SLAM, and flight script (i.e., pre-set commands for autopilot). The average power consumption of the RPi when executing the autopilot is 3.39 W, which increases to 4.05 W when we start SLAM but the drone is not flying yet (SLAM is idle). Finally, when the drone flies and SLAM actively processes input data, up to 5 W of power is consumed and the average power consumption of the RPi reaches 4.56 W; we use these numbers to estimate heavy computation power consumption in Figure 11. Thus, by offloading SLAM onto a low-power platform such as ASIC/FPGA, we can potentially save up to 2 W, which would have a high impact for small drones (e.g., Parrot Mambo [68]). Figure 16b depicts the power consumption graph of the entire drone, with an average of 130 W. In terms of Figure 10, this 130 W corresponds to only 30% of the flying load. The power consumption goes as high as 250 W at higher loads (58% flying load) with simple movements. In maneuvering (Figure 10d–f), the contribution of computation power consumption reduces significantly.
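To make the share of computation concrete, the reported averages can be combined in a few lines. This is only a back-of-the-envelope sketch: the variable and function names are illustrative, and it assumes the RPi draw is part of the 130 W whole-drone average, which the text does not state explicitly.

```python
# Average power figures reported in this section (watts).
RPI_AUTOPILOT_W = 3.39      # autopilot only
RPI_SLAM_IDLE_W = 4.05      # SLAM started, drone not flying
RPI_SLAM_ACTIVE_W = 4.56    # SLAM processing input during flight
DRONE_AVG_W = 130.0         # whole drone at about 30% flying load


def share(part_w: float, total_w: float) -> float:
    """Fraction of total power drawn by one component."""
    if total_w <= 0:
        raise ValueError("total power must be positive")
    return part_w / total_w


# Average extra power attributable to running SLAM on the RPi.
slam_extra_w = RPI_SLAM_ACTIVE_W - RPI_AUTOPILOT_W

# Share of the whole-drone power drawn by the RPi (assumption: RPi is included in 130 W).
compute_share = share(RPI_SLAM_ACTIVE_W, DRONE_AVG_W)

print(f"SLAM adds about {slam_extra_w:.2f} W on average")
print(f"RPi accounts for about {compute_share:.1%} of drone power")
```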

Offloading SLAM to Hardware Accelerators
Besides preventing lags in the responses of the autopilot, offloading SLAM to a hardware accelerator (i) improves the performance of SLAM and (ii) helps extend the flight time by consuming less power. This section explores these two aspects by implementing SLAM on our three hardware platforms.
SLAM performance. Figure 17 shows the time to process each EuRoC dataset while executing ORB SLAM on an RPi4 (with no other application), TX2, and FPGA. Our FPGA implementation extensively accelerates the local and global bundle adjustments of ORB SLAM (≈90% of execution time on RPi) by using simple modules of dense fixed-size matrix algebra in a pipeline. For further acceleration, we also integrate the eSLAM design [50], which accelerates feature extraction. Running SLAM on a separate RPi improves its performance by 2.3× (IPC from Figure 15). As Figure 17 illustrates, the TX2 and FPGA implementations are 2.16× and 30.7× faster than the implementation of SLAM on the RPi. As a result, all these implementations, including the slowest, meet the rate of sensors (e.g., cameras and LiDARs), even those with more than 100 FPS. Although all design choices satisfy the real-time requirement, they span a 400× landscape in power consumption. Therefore, the question is: how should we navigate this 400× landscape in power consumption?
Figure 17: ORB SLAM speedup over RPi for TX2 and FPGA by category: feature extraction and bundle adjustments.
Flight time. Offloading SLAM (or any other heavy computation) to a hardware accelerator reduces power consumption but adds weight to the system. This section explores the combined effect on flight time. The power consumption of our FPGA implementation is 417 mW, compared to 24 mW for the ASIC [19], 5 W for the RPi, and 10 W for the TX2. Since, on average, our drone consumes 130 W, saving 10 W by moving from TX2 to FPGA gives us +1 minute of flight time (≈ 10/140 × 15 min). For small drones, moving from CPU/GPU to FPGA with 20× power savings reduces the total power consumption by approximately 15–20%, enabling an additional +2–3 minutes of flight time (≈ 10/50 × 15 min). However, the lengthy process of fabricating a special ASIC chip to gain an additional 20× power savings (saving 400 mW) earns us only a few seconds. Table 5 summarizes the cost of various platforms for executing SLAM on drones, with the RPi as the baseline. Since the TX2 consumes more power and is heavier, its gain in flight time is negative. Both FPGA and ASIC have almost identical impacts on flight time for large drones and small drones (only 20 seconds in additional flight time for ASIC in small drones). However, ASIC integration and fabrication costs are extremely high, which makes the FPGA the best platform even though it consumes more power than the ASIC.
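The flight-time estimates above use a first-order rule: the gain is roughly the fraction of total power saved, multiplied by the nominal flight time. A minimal sketch of that rule follows, using the numbers quoted in the text. The function name is hypothetical, and the sketch assumes constant battery energy and ignores the added weight of the accelerator.

```python
def flight_time_gain_min(saved_w: float, total_w: float, flight_min: float) -> float:
    """First-order flight-time gain: (power saved / total power) * flight time."""
    if total_w <= 0:
        raise ValueError("total power must be positive")
    return saved_w / total_w * flight_min


# Large drone: TX2 -> FPGA, using the 10 / 140 x 15 min approximation from the text.
large_gain = flight_time_gain_min(saved_w=10, total_w=140, flight_min=15)

# Small drone: using the 10 / 50 x 15 min approximation from the text.
small_gain = flight_time_gain_min(saved_w=10, total_w=50, flight_min=15)

# Spread between the most and least power-hungry platforms (TX2 at 10 W, ASIC at 24 mW).
power_spread = 10.0 / 0.024

print(f"large drone: +{large_gain:.1f} min")
print(f"small drone: +{small_gain:.1f} min")
print(f"TX2/ASIC power ratio: {power_spread:.0f}x")
```

Because the rule is linear in the power saved, the same function can be reused with other platform pairs from Table 5, as long as the weight difference between platforms is small enough to neglect.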

RELATED WORK
Prior studies [29,30] have proposed a closed-loop simulator and benchmark suite for autonomous tasks in drones, mainly focusing on outer-loop tasks, which is not the main focus of this paper. Their discussions pertain only to high-speed drones. In contrast to the assumptions made there, we argue that, first, the mission planning computation does not increase hovering time since mission planning has relaxed deadlines [79]. Even in high-speed, indoor, and cluttered environments, new algorithms have been proposed to enable fast planning [21,28]. Second, collision detection does not necessarily require heavy computations (e.g., using laser-range, infrared, or RGBD sensors, or even a microcontroller) [80–83]. Third, localization is a highly active research area and does not necessarily limit current drone speeds (e.g., real-time odometry and NASA JPL's autonomous racing) [14,24,84]. Finally, the conclusions described in [29,30] are based on maximum drone acceleration, the value of which is not readily known from the specifications. Early versions of this work have been published [85,86].

CONCLUSIONS
This is the first paper that (i) formalized fundamental drone subsystems and quantified how computation power consumption varies in drones and affects design-space parameters such as flight time; (ii) studied the computing required for inner-loop control; and (iii) proposed an open-source drone framework and explored the acceleration landscape of SLAM, while motivating further research within the community. We found that although the outer-loop control is not directly related to real-time control systems due to the nature of its heavy computation, it has to consider deadlines, and thus improving the performance of processors is important. For the inner loop, which controls real-time hardware, the amount of computation is relatively low, so low-end embedded computing platforms are satisfactory. However, due to the critical nature of the inner-loop control, all drones have dedicated processors for it. We found that for small drones, improving power efficiency translates into an increase in flight time, but for heavy drones (>≈2 kg), the improvement in power efficiency does not have an effect. Therefore, FPGA implementations provide the most cost-effective solution for small and large drones. It is worth mentioning that the studied tradeoffs are different for nano and pico drones with a total power consumption of 100 mW [19, 87–90]. We did not focus on such drones because they are extremely customized (from physics to material sciences), so it was not possible to study them within the same framework. Furthermore, we used the minimum TWR of 2. A detailed evaluation for other TWR values, which results in a lower contribution of computation power consumption, can be done in a similar way using the material released in our repository.

A ARTIFACT APPENDIX
A.1 Abstract
This artifact describes our open-source experimental drone framework that is customizable across its hardware-software stack. The main portion of the artifact focuses on building the drone, which complements the beginning sections of the paper. The build guide consists of two parts: hardware and software. The hardware guide presents a list of required hardware components (accessible to anyone) followed by a step-by-step assembly guide. The software component provides the firmware of the drone and enables users to execute any software that is supported on Linux. We provide the necessary packages and configuration of the software setup. Finally, as an example, we provide simple scripts for perf metrics measurements while describing energy consumption measurements (requires an oscilloscope with high-frequency data logging and 30A current probes). Note: For some artifacts, we provide two links: (1) The original link of the software by the provider; and (2)

A.4 Installation
A.4.1 Drone Assembly. The first step is to assemble the drone. An overview of the instructions is given below. For a more detailed build guide with pictures, please see /BuildGuide.
• First assemble the PI + NAVIO. Plug in the HAT into the GPIO pins on the RPI.
• Solder the bullet connectors onto the motor connections.
• Solder the battery connector onto the Power Distribution Board (PDB).
• Screw in the legs of the frame.
• Screw in the top plate to the frame.
• Attach motors to the frame according to the rotational direction listed in the motor manual.
• Use double-sided tape and attach Raspberry Pi + NAVIO to the drone top plate.
• Use double-sided tape and attach the PPM encoder to the frame.
• Connect the battery connectors to the PPM encoder.
• Stick the RC receiver onto the frame.
• Connect receiver to the NAVIO.
• Connect PPM outputs to NAVIO.
• Use zip ties and attach ESCs to the bottom of the legs.
• Assemble the GPS mount and zip tie it onto the back-right leg (Note: GPS unit must point North-South).
• Attach GPS on mount and connect GPS to NAVIO.
• Connect ESCs to motors and ESC PWM to PPM encoder.
• Connect battery to battery connector.
• Finally connect TELEM module to HAT and stick module onto frame.
A.4.2 Drone Software Configuration. After building the drone, the following software steps are needed to download and configure the software stack on the drone.
• Download the Emlid OS from here or /EmlidOS.
• Flash the downloaded .iso file to the MicroSD card (you can use Etcher as a tool) and insert it into the Pi.
• Follow the first-time setup community guide of Arducopter here or /ArducopterWiki under "First Time Setup."
• Next, it is critical to configure and calibrate the sensors and IMU.
• Configure autopilot to load on boot: $sudo emlidtool -on_boot=True.
• Review DroneKit docs or /DroneKitDocs to see how to use the API.
• To spool up Arducopter, run $sudo systemctl daemon-reload and then run $sudo systemctl restart arducopter.
• Note: RCIO Worker is a background helper service for Arducopter and automatically starts when Arducopter is started.
A.4.3 Setting up and Configuring SLAM.
• Clone our GitHub repository or /ParallelML-Drone, and change directory to slam: $cd drone/slam
• Download a sample image data set (here) or /EuroC-MH01Easy.
• Extract the data set in the slam directory.
• SLAM is now running in the background.
• To stop SLAM run $docker-compose down.

A.5 Experiment Workflow
With a fully working drone, this section describes and provides simple scripts for measuring performance metrics (any performance metric that is available to the perf tool).
This repository or /ParallelML-Drone contains all the required files (and a full backup of our SD card). Specifically, shell scripts perf_ardu_slam.sh and perf_ardupilot_loop.sh execute simple experiments for Ardupilot and SLAM, respectively. Directory boot_pi_backup/ contains a backup of our SD card. To use this version, copy the files to the SD card and rename the directory to boot.

A.6 Evaluation and Expected Results
Performance Metric Measurements: Execute the above scripts by passing the PIDs of ArduCopter, RCIO_Worker, and SLAM (in this order). The scripts then print several metrics for branches, cache operations, and virtual memory management. The exact flags depend on the particular architecture, and we have fine-tuned them for Raspberry Pi 3B+.
Energy Measurements: To perform energy measurements, an oscilloscope with high-frequency data logging and 30A current probes is required. The current probes are used to measure the current on the input power wires from the LiPo battery. To measure energy (or energy/second), another probe measures the voltage of the battery. By setting the oscilloscope function to multiply these measurements, we can log energy per second of the entire drone. To isolate the Raspberry Pi's contribution, an additional in-loop USB power meter is needed to measure the Raspberry Pi power consumption. Non-flight measurements can be done while the drone is not active. For flight-related measurements, flip the propellers so the drone pushes down (while consuming a similar amount of energy).
Paper Graphs: You can find the raw data from which the graphs are constructed at /Drone-CSVs.
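The oscilloscope's multiply function in the energy measurement corresponds to computing instantaneous power as the product of voltage and current for each sample. If the traces are instead exported separately, a minimal post-processing sketch could look like the following. The function name, the fixed sampling interval, and the sample values are illustrative assumptions and are not part of the released scripts.

```python
from typing import Sequence, Tuple


def power_and_energy(volts: Sequence[float], amps: Sequence[float],
                     dt_s: float) -> Tuple[float, float]:
    """Return (average power in W, energy in J) for equally spaced samples."""
    if len(volts) != len(amps) or not volts:
        raise ValueError("need two non-empty traces of equal length")
    if dt_s <= 0:
        raise ValueError("sampling interval must be positive")
    power = [v * i for v, i in zip(volts, amps)]  # instantaneous power per sample
    avg_w = sum(power) / len(power)
    energy_j = sum(power) * dt_s                  # rectangle-rule integration
    return avg_w, energy_j


# Made-up battery samples at a 20 ms interval (not measured data).
volts = [11.1, 11.0, 11.0, 10.9]
amps = [11.5, 12.0, 11.8, 12.1]
avg_w, energy_j = power_and_energy(volts, amps, dt_s=0.020)
print(f"average power {avg_w:.1f} W, energy {energy_j:.2f} J")
```

Subtracting the USB power meter reading for the Raspberry Pi from the whole-drone average then gives an estimate of the non-compute power, under the assumption that both readings cover the same time window.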

A.7 Experiment Customization
Users are free to change any part of the firmware or write their own applications for the drone. Additionally, users may add any new sensors or hardware components that are compatible with the Raspberry Pi or its GPIO protocols (e.g., I2C).