    Please use this identifier to cite or link to this item: http://ir.lib.cyut.edu.tw:8080/handle/310901800/43041


    Title: 一個用以提高入侵偵測系統性能的多動態特徵選擇機器學習方法
    A Machine Learning Based Multi-Dynamic Feature Selection Scheme to Improve the Performance of IDS
    Authors: 默藍里
    AHMAD, RAMLI
    Contributors: Department of Information Management (資訊管理系)
    李麗華
    LI, LI-HUA
    Keywords: 入侵偵測系統(IDS);機器學習(ML);合成少數群過度採樣技術 (SMOTE);多動態特徵選擇 (MDFS)
    Intrusion Detection System (IDS);Machine Learning (ML);Synthetic Minority Oversampling Technique (SMOTE);Multi-Dynamic Feature Selection (MDFS)
    Date: 2024-03-01
    Issue Date: 2024-04-18 10:13:25 (UTC+8)
    Abstract: 入侵偵測系統(IDS)對於偵測惡意活動和警告即將發生的威脅非常有用。當入侵偵測系統監控到網路流量有違規行為或威脅時,系統就會發出警報。常見的入侵偵測系統有三個常見的步驟,即監視、偵測和警報等三個步驟。  由過去的研究可看出入侵偵測系統經常面臨許多挑戰,特別是在入侵偵測系統所收集到的資料類型經常是分佈不均的,常見的就是正常的流量數據和惡意或攻擊數據有很大的數量落差,這樣的情形也叫資料不平衡。這種不平衡的數據資料集,容易使訓練的模型產生偏差,並導致模型在進行偵測時,對於真正攻擊的資訊判斷不佳。另一個挑戰則是入侵偵測系統所蒐集到的數據集會包含許多雜訊或不相關的資訊,這些資料可能會影響機器學習模型偵測入侵的效能。為了克服上述的問題,過去學者們也提出不少解決的機器學習模型或方法。有關數據不平衡的問題仍然是目前研究學者們仍積極克服的議題,相關做法包括資料預處理時抽取較少樣本(under-sampling)方式、利用數據增強(data augmentation)方式或利用過度採樣(oversampling)方式等來減輕資料集裡的不平衡情形,減少資料的分佈不均才能提高機器學習模型對惡意入侵的偵測可能。除了資料預處理,特徵選擇(FS)方法也是用來識別入侵資料很重要關鍵研究,因為特徵選擇(FS)方法能有效減少數據集的雜訊資料、降低資料維度並提高模型計算效能和準確性。由上可知,選擇合適的資料預處理技術、特徵選擇方法和機器學習演算法至關重要。本研究的目的是提出一個通用的「多動態特徵選擇(MDFS)」技術,用以降低數據維度並提高入侵偵測系統中機器學習模型的性能。本研究將多動態特徵選擇與機器學習模型相結合,針對三個完全不同的網路資料數據集:KDD Cup 99、CICIDS 2017和UNSW NB15進行研究和實驗。由本研究所提的方法經實驗結果證實,本研究提出的整合模型能有效地偵測入侵攻擊,所獲得的準確率和F1值都優於過去其它學者的研究結果。由此可知,本研究所提的多動態特徵選(MDFS)技術確實能提高入侵偵測系統的效能,即便在三個不同的數據資料集中,依然能克服不同數據集的資料不平衡內容和雜訊問題,達成提升效能的成果。
    Intrusion Detection Systems (IDS) detect malicious activity and warn of impending threats. When an IDS monitoring network traffic detects a policy violation or threat, it raises an alert. The IDS process typically has three steps: monitoring, detection, and alerting. Past studies have shown that IDS face many challenges. One is data imbalance: the volume of normal traffic differs greatly from the volume of malicious or attack traffic. Imbalanced data biases the trained model and causes it to perform poorly in detecting minority classes or malicious attacks. Another challenge is noisy or irrelevant data in IDS datasets, which can degrade the performance of Machine Learning (ML) models in detecting intrusions. To overcome these challenges, researchers have proposed many ML models and solutions. Data preprocessing is usually the critical step in handling data imbalance; approaches such as under-sampling, data augmentation, and oversampling reduce the imbalanced distribution and improve the performance of the ML model. In addition to data preprocessing, Feature Selection (FS) methods are critical for recognizing important features, reducing noise in the dataset, and improving model accuracy. A good IDS should therefore combine an appropriate data preprocessing technique, proper FS methods, and high-performance ML algorithm(s). This research aims to improve the performance of ML models in IDS by reducing data imbalance and dimensionality using Multi-Dynamic Feature Selection (MDFS) techniques. MDFS is combined with ML models and evaluated on three different datasets: KDD Cup 99, CICIDS 2017, and UNSW NB15. The experimental results show that the proposed model detects attacks better than other researchers' models, achieving higher Accuracy and F1-Score than those reported in other studies.
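The abstract's point about imbalanced data and its choice of Accuracy and F1-Score as metrics can be illustrated with a small sketch. This is not the thesis's MDFS method, and it does not implement SMOTE: plain random duplication of minority samples stands in for oversampling. The class sizes (990 normal, 10 attack) and the helper names `accuracy_and_f1` and `random_oversample` are assumptions made up for this example.

```python
import random
from collections import Counter


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) with `positive` treated as the attack class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # F1 is 0 when no attack is found
    return accuracy, f1


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size.

    Hypothetical stand-in for oversampling; SMOTE would synthesize new
    samples by interpolation instead of copying existing ones.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy imbalanced labels (assumed sizes): 0 = normal, 1 = attack.
y_true = [0] * 990 + [1] * 10

# A biased model that always answers "normal" looks accurate but finds no attacks.
y_pred_all_normal = [0] * len(y_true)
acc, f1 = accuracy_and_f1(y_true, y_pred_all_normal)

# Oversampling the attack class balances the training labels.
rows = [[float(i)] for i in range(len(y_true))]
balanced_rows, balanced_labels = random_oversample(rows, y_true)
balanced_counts = Counter(balanced_labels)
```

In this toy case the always-normal predictor scores high on Accuracy while its F1-Score for the attack class is zero, which is why the two metrics are reported together on imbalanced IDS data.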
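    Appears in Collections: [Department of Information Management, Graduate Institute of Information Technology (資訊管理系、資訊科技研究所)] Theses and Dissertations (博碩士論文)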

    Files in This Item:

    File                     Description   Size     Format
    112CYUT0396005-001.pdf                 2792Kb   Adobe PDF


    All items in CYUTIR are protected by copyright, with all rights reserved.

