| Category | Objective | Ref. | Method | Highlights |
| --- | --- | --- | --- | --- |
| Computing offloading optimization | Reduce energy consumption | [98] | Distributed DL-based offloading algorithm | Incorporates the cost of changing local execution tasks into the cost function |
| | Reduce latency | [88] | Smart-Edge-CoCaCo algorithm based on DL | Joint optimization of wireless communication, collaborative filter caching, and computing offloading |
| | | [89] | A heuristic offloading method | Origin-destination network distance estimation and heuristic search to find the optimal strategy for shortening the transmission delay of DL tasks |
| | | [54] | Cooperative Q-learning | Improves the search speed of traditional Q-learning |
| | | [90] | TD learning with post-decision state and semi-gradient descent | Approximate dynamic programming to cope with the curse of dimensionality |
| | | [91] | Online RL | Exploits the special structure of state transitions to overcome the curse of dimensionality; additionally considers an EC scenario with energy harvesting |
| | Reduce both energy consumption and latency | [93] | DRL-based offloading scheme | Requires no prior knowledge of the transmission delay and energy consumption models; compresses the state-space dimension through DRL to further improve the learning rate; additionally considers an EC scenario with energy harvesting |
| | | [94] | DRL-based computing offloading approach | Models computing offloading as a Markov decision process; learns network dynamics through DRL |
| | | [95] | Q-function decomposition combined with double DQN | Double deep Q-network obtains the optimal computing offloading policy without prior knowledge; a new DNN-based function approximator handles high-dimensional state spaces |
| | | [10] | RL based on neural network architectures | Formulates the optimization problem as an infinite-horizon average-reward continuous-time Markov decision process; a new value-function approximator handles high-dimensional state spaces |
| Other ways to reduce energy consumption | Optimize the hardware structure of edge devices | [102] | Binary-weight CNN | A static random-access memory for binary-weight CNNs that reduces memory data throughput; parallel execution of the CNN |
| | | [104] | DNNs | FPGA-based binarized DNN accelerator for weed species classification |
| | Control device operating status | [105] | DRL-based joint mode selection and resource management | Reduces medium- and long-term energy consumption by controlling the communication mode of user equipment and the on/off state of processors |
| | Combine with the energy Internet | [106] | Model-based DRL | Solves the energy supply problem of multi-access edge servers |
| | | [70] | RL | A fog-computing node powered by a renewable energy generator |
| Security of edge computing | | [113] | Minimax-Q learning | Gradually learns the optimal strategy, increasing spectral efficiency and throughput |
| | | [114] | Online learning | Reduces bandwidth usage by choosing the most reliable server |
| | | [115] | Multiple AI algorithms | An algorithm-selection mechanism that intelligently picks the optimal AI algorithm |
| | | [117] | Hypergraph clustering | Improves the recognition rate by modeling the relationship between edge nodes and DDoS attacks through hypergraph clustering |
| | | [112] | Extreme Learning Machine | Shows faster convergence and stronger generalization of the Extreme Learning Machine classifier than most classical algorithms |
| | | [56] | Distributed DL | Reduces the burden of model training and improves model accuracy |
| | | [120] | DL, restricted Boltzmann machines | Adds active learning capabilities to improve recognition of unknown attacks |
| | | [122] | Deep PDS-learning | Speeds up training with additional information (e.g., the energy utilization of edge devices) |
| Privacy protection | | [124] | Generative adversarial networks | An objective perturbation algorithm and an output perturbation algorithm that satisfy differential privacy |
| | | [125] | A deep inference framework called EdgeSanitizer | Maximizes the usable value of data while ensuring privacy protection |
| | | [77] | Deep Q-learning | Derives trust values using uncertain reasoning; avoids local convergence by adjusting the learning rate |
| Resource allocation optimization | | [166] | Actor-critic RL | An additional DNN represents a parameterized stochastic policy, further improving performance and convergence speed; a natural policy gradient method avoids convergence to local optima |
| | | [76] | DRL-based resource allocation scheme | Additional SDN support to improve QoS |
| | | [127] | Multi-task DRL | Transforms the last layer of the DNN that estimates the Q-function to support higher-dimensional action spaces |
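Several of the RL-based offloading entries above ([54], [90], [91]) build on classical value-based RL. As a minimal sketch, not the algorithm of any cited paper, the following tabular Q-learning loop chooses between local execution and edge offloading for a toy task queue; the environment dynamics, state discretization, and reward weights are all illustrative assumptions.

```python
import random

# Toy offloading environment: state = (queue_length, channel_quality),
# action 0 = execute locally, action 1 = offload to the edge server.
# All dynamics and costs below are illustrative assumptions, not taken
# from [54], [90], or [91].
QUEUE_LEVELS, CHANNEL_LEVELS = 5, 3
ACTIONS = (0, 1)

def step(state, action):
    queue, channel = state
    if action == 0:                      # local execution: slow but channel-independent
        delay = 2.0 + 0.5 * queue
    else:                                # offloading: fast when the channel is good
        delay = 1.0 + 3.0 / (channel + 1) + 0.2 * queue
    next_state = (random.randrange(QUEUE_LEVELS), random.randrange(CHANNEL_LEVELS))
    return next_state, -delay            # reward = negative delay

# Q-table over the discretized state space
Q = {(q, c): [0.0, 0.0] for q in range(QUEUE_LEVELS) for c in range(CHANNEL_LEVELS)}
alpha, gamma, eps = 0.1, 0.9, 0.1

state = (0, 0)
for _ in range(20000):
    # epsilon-greedy action selection
    action = random.choice(ACTIONS) if random.random() < eps else max(ACTIONS, key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    # standard Q-learning update toward the one-step TD target
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

# Inspect the learned policy: offloading should win when the channel is good
for c in range(CHANNEL_LEVELS):
    print(f"channel={c}:", ["local" if Q[(q, c)][0] >= Q[(q, c)][1] else "offload" for q in range(QUEUE_LEVELS)])
```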
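[95] combines Q-function decomposition with a double deep Q-network. As an illustration of the double-DQN ingredient only, the snippet below computes the double-DQN target, in which the online network selects the next action and the target network evaluates it, reducing overestimation bias. The network sizes, state/action dimensions, and the synthetic replay batch are assumptions made purely for a runnable example.

```python
import torch
import torch.nn as nn

# Minimal double-DQN target computation; all dimensions are illustrative.
STATE_DIM, N_ACTIONS, GAMMA = 8, 4, 0.99

def make_qnet():
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))

online_net, target_net = make_qnet(), make_qnet()
target_net.load_state_dict(online_net.state_dict())

# Fake replay batch: (state, action, reward, next_state, done)
batch = 32
s    = torch.randn(batch, STATE_DIM)
a    = torch.randint(0, N_ACTIONS, (batch, 1))
r    = torch.randn(batch, 1)
s2   = torch.randn(batch, STATE_DIM)
done = torch.zeros(batch, 1)

with torch.no_grad():
    # Double DQN: the online net *selects* the next action ...
    next_a = online_net(s2).argmax(dim=1, keepdim=True)
    # ... and the target net *evaluates* it.
    target_q = r + GAMMA * (1 - done) * target_net(s2).gather(1, next_a)

q = online_net(s).gather(1, a)           # Q(s, a) for the taken actions
loss = nn.functional.smooth_l1_loss(q, target_q)
loss.backward()                          # one gradient step on the online net
print("TD loss:", loss.item())
```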
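On the hardware side, [102] and [104] rely on binary-weight networks to cut memory traffic. The NumPy sketch below shows the core arithmetic of a binary-weight layer, with weights quantized to ±1 and rescaled by a per-layer factor; the XNOR-Net-style scaling rule and the shapes are our assumptions, not necessarily the exact scheme of [102] or [104].

```python
import numpy as np

# Binary-weight layer forward pass: weights are stored as +/-1 times a
# per-layer scale, cutting weight memory roughly 32x versus float32.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64))       # full-precision weights (training-time)
x = rng.standard_normal(64)             # input activations stay full precision

alpha = np.abs(W).mean()                # per-layer scale (XNOR-Net-style choice)
W_bin = np.sign(W)                      # 1-bit weights in {-1, +1}

y_full = W @ x                          # reference full-precision output
y_bin  = alpha * (W_bin @ x)            # binary-weight approximation

err = np.linalg.norm(y_full - y_bin) / np.linalg.norm(y_full)
print(f"relative error of binary-weight layer: {err:.2f}")
```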
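For attack classification, [112] uses an Extreme Learning Machine, whose fast convergence comes from fixing random hidden weights and solving only the output weights in closed form. The sketch below shows that closed-form training step on synthetic two-class data; the data, hidden size, and regularization constant are illustrative assumptions.

```python
import numpy as np

# Extreme Learning Machine classifier: hidden weights are random and fixed,
# output weights are obtained by regularized least squares.
rng = np.random.default_rng(1)
n, d, hidden = 400, 10, 64

X = rng.standard_normal((n, d))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # synthetic labels

W_in = rng.standard_normal((d, hidden))           # random, never trained
b = rng.standard_normal(hidden)
H = np.tanh(X @ W_in + b)                         # hidden-layer activations

# Output weights via regularized least squares (the entire "training" step).
lam = 1e-3
beta = np.linalg.solve(H.T @ H + lam * np.eye(hidden), H.T @ y)

pred = (H @ beta > 0.5).astype(float)
print("training accuracy:", (pred == y).mean())
```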
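On the privacy side, [124] proposes objective and output perturbation algorithms satisfying differential privacy, and [125] (EdgeSanitizer) sanitizes data before release. A standard building block for output perturbation is the Laplace mechanism, sketched below; the query, value bounds, sensitivity, and epsilon values are illustrative assumptions rather than the exact mechanisms of those papers.

```python
import numpy as np

def laplace_output_perturbation(true_value, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity/epsilon, the standard
    recipe for epsilon-differential privacy of a numeric query."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

rng = np.random.default_rng(0)
readings = np.array([3.2, 4.1, 2.8, 5.0])      # hypothetical edge-sensor data
true_mean = readings.mean()

# The mean of n readings bounded in [0, 10] has sensitivity 10 / n.
sensitivity = 10.0 / len(readings)
for eps in (0.1, 1.0, 10.0):
    noisy = laplace_output_perturbation(true_mean, sensitivity, eps, rng)
    print(f"epsilon={eps:>4}: true={true_mean:.2f}, released={noisy:.2f}")
```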
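Finally, [166] pairs a DNN-parameterized stochastic policy with a critic and a natural policy gradient. The sketch below is a minimal one-step actor-critic on a toy allocation task, using a vanilla policy gradient in place of the natural gradient of [166]; the task, architectures, and hyperparameters are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal one-step actor-critic: a stochastic policy (actor) trained with
# an advantage-weighted policy gradient, a value network (critic) trained
# by regression. Everything below is a toy stand-in, not the setup of [166].
STATE_DIM, N_ACTIONS = 4, 3

actor  = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(), nn.Linear(32, N_ACTIONS))
critic = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-2)

def reward(state, action):
    # Toy rule: action i pays off when feature i dominates (pure illustration).
    return float(state[action % STATE_DIM] > state.mean())

for step in range(2000):
    state = torch.randn(STATE_DIM)
    dist = torch.distributions.Categorical(logits=actor(state))
    action = dist.sample()
    r = reward(state, action.item())

    value = critic(state).squeeze()
    advantage = r - value.detach()                # one-step advantage estimate
    actor_loss  = -dist.log_prob(action) * advantage
    critic_loss = (r - value) ** 2

    opt.zero_grad()
    (actor_loss + critic_loss).backward()
    opt.step()

print("policy probabilities for a sample state:",
      torch.softmax(actor(torch.randn(STATE_DIM)), dim=-1))
```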