TY - GEN
T1 - RLBrowse
T2 - 2022 IEEE/IFIP Network Operations and Management Symposium, NOMS 2022
AU - Griessel, Alexander
AU - Stephan, Maximilian
AU - Mieth, Martin
AU - Kellerer, Wolfgang
AU - Kramer, Patrick
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Automated web browsing tools, such as Selenium and headless browsers, are used to collect traffic traces from networked applications, from which statistical models describing the traffic are obtained. However, we show that traces from Selenium and headless browsers have markedly different traffic characteristics from human-generated traces, with potential impact on the quality of the obtained models. To overcome this limitation, we propose RLBrowse, a web automation framework that imitates human browsing habits by using reinforcement learning and by separating web automation from the browser. By separating the browser and the automation tool, RLBrowse improves on 9 of the 13 traffic trace features tested. The distribution of packet sizes in a trace improves the most, by nearly 400%. We test RLBrowse by collecting a corpus of network packet traces from a set of human-navigated website browsing sessions and from sessions driven by RLBrowse and by Selenium. In the subsequent analysis, we identify key differences in the resulting packet traces.
UR - http://www.scopus.com/inward/record.url?scp=85133211192&partnerID=8YFLogxK
U2 - 10.1109/NOMS54207.2022.9789851
DO - 10.1109/NOMS54207.2022.9789851
M3 - Conference contribution
AN - SCOPUS:85133211192
T3 - Proceedings of the IEEE/IFIP Network Operations and Management Symposium 2022: Network and Service Management in the Era of Cloudification, Softwarization and Artificial Intelligence, NOMS 2022
BT - Proceedings of the IEEE/IFIP Network Operations and Management Symposium 2022
A2 - Varga, Pal
A2 - Granville, Lisandro Zambenedetti
A2 - Galis, Alex
A2 - Godor, Istvan
A2 - Limam, Noura
A2 - Chemouil, Prosper
A2 - Francois, Jerome
A2 - Pahl, Marc-Oliver
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 April 2022 through 29 April 2022
ER -