Temporal Human Action Detection in Long and Untrimmed Videos

Thesis Info

Access Option

External Link

Author

Murtaza, Fiza

Program

PhD

Institute

University of Engineering and Technology

City

Taxila

Province

Punjab

Country

Pakistan

Thesis Completing Year

2019

Thesis Completion Status

Completed

Subject

Computer Science

Language

English

Link

http://prr.hec.gov.pk/jspui/bitstream/123456789/12248/1/Fiza%20Murtaza_Comp%20Vision_2019_UET%28T%29.pdf

Added

2021-02-17 19:49:13

Modified

2024-03-24 20:25:49

ARI ID

1676727846125

With advances in information and communication technologies, sensing devices have become pervasive. The pervasiveness of cameras makes it possible to record video anytime and anywhere. This gives rise to a massive amount of untrimmed video data, which contains several human activities and actions as well as background activity. It is important to detect the actions of interest in such long, untrimmed videos so that they can be used in numerous applications, e.g., video analysis, video summarization, surveillance, retrieval, and captioning. This thesis targets temporal human action detection in long and untrimmed videos. Given a long, untrimmed video, the task of temporal action detection is to detect the start and end times of all occurrences of the actions of interest and to predict the action label of each detected interval. Detecting human actions in long untrimmed videos is an important but challenging problem because such videos are unconstrained in both space and time. In this work we address the temporal action detection problem using two different paradigms: "proposal + classification" and "end-to-end temporal action detection". In the proposal + classification approach, the regions likely to contain human actions, known as proposals, are first generated from untrimmed videos and are then classified into the target actions. To this end, we propose two different methods for generating action proposals: (1) unsupervised and (2) supervised temporal action proposal methods. In the first method, we propose an unsupervised proposal generation method named Proposals from Motion History Images (PMHI). PMHI discriminates action from non-action regions by clustering the MHIs into action and non-action segments, detecting minima in the energy of the MHIs. The strength of PMHI is that it is unsupervised, which removes the need for any training data.
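As a rough illustration of the PMHI idea described above, the following Python sketch builds a motion history image (MHI) from binary motion masks, sums it into a per-frame energy, and cuts the sequence at local energy minima. It is a minimal sketch under stated assumptions, not the thesis implementation: the function names `mhi_energy` and `split_at_minima`, the decay parameter `tau`, the `min_energy` filter, and the use of the textbook MHI update rule are all assumptions made for illustration.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary motion mask: 1 where the silhouette moved


def mhi_energy(masks: List[Mask], tau: int = 5) -> List[int]:
    """Return the per-frame energy (sum of all MHI values). Hypothetical helper."""
    height, width = len(masks[0]), len(masks[0][0])
    mhi = [[0] * width for _ in range(height)]
    energy = []
    for mask in masks:
        for y in range(height):
            for x in range(width):
                # Textbook MHI update: refresh moving pixels to tau, decay the rest by 1.
                mhi[y][x] = tau if mask[y][x] else max(0, mhi[y][x] - 1)
        energy.append(sum(map(sum, mhi)))
    return energy


def split_at_minima(energy: List[int], min_energy: float = 1.0) -> List[Tuple[int, int]]:
    """Cut at local energy minima and keep segments whose mean energy is high enough.

    The min_energy filter is an assumption; the abstract only mentions minima detection.
    """
    cuts = [0]
    for t in range(1, len(energy) - 1):
        prev, cur, nxt = energy[t - 1], energy[t], energy[t + 1]
        # A minimum (including either edge of a flat valley), but not a flat interior point.
        if cur <= prev and cur <= nxt and (cur < prev or cur < nxt):
            cuts.append(t)
    cuts.append(len(energy))
    proposals = []
    for start, end in zip(cuts, cuts[1:]):
        segment = energy[start:end]
        if segment and sum(segment) / len(segment) >= min_energy:
            proposals.append((start, end))  # half-open frame interval [start, end)
    return proposals
```

Because the sketch only needs motion masks, it requires no training data, which mirrors the unsupervised property claimed for PMHI. It also shares the stated weakness: the masks must come from a reliable silhouette extractor.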
PMHI outperforms existing proposal methods on the Multi-view Human Action Video (MuHAVi)-uncut and Computer Vision and Pattern Recognition (CVPR) 2012 Change Detection datasets. However, PMHI depends on precise silhouette extraction, which is challenging for realistic videos and for moving cameras. To solve this problem, we propose a supervised temporal action proposal method named Temporally Aggregated Bag-of-Discriminant-Words (TAB), which works directly on RGB videos. TAB is based on the observation that there are many overlapping frames in the action and background temporal regions of untrimmed videos, which makes it difficult to segment actions from non-action regions. TAB solves this issue by extracting class-specific codewords from the action and background videos and deriving discriminative weights for these codewords based on their ability to discriminate between the two classes. We integrate these discriminative weights with Bag-of-Words encoding, which we call Bag-of-Discriminant-Words (BoDW). We sample the untrimmed videos into non-overlapping snippets and temporally aggregate the BoDW representations of multiple snippets into action proposals. We demonstrate the effectiveness of the TAB proposal method on two challenging temporal action detection datasets, MSR-II and Thumos14, where it improves upon state-of-the-art methods. The "proposal + classification" paradigm requires multiple passes through the test data for its two stages; it is therefore difficult to use these methods in an end-to-end manner. To solve this problem, we propose an end-to-end temporal action detection method known as Bag of Discriminant Snippets (BoDS). BoDS is based on the observation that multiple actions and the background class have similar snippets, which causes incorrect classification of action regions and imprecise boundaries. We solve this issue by finding the key-snippets from the training data of each class and computing their discriminative power, which is used in the BoDS encoding.
During testing on an untrimmed video, we find the BoDS representation for multiple candidate regions and determine their class labels using a majority voting scheme. We test BoDS on the Thumos14 and ActivityNet datasets and obtain state-of-the-art results.
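The majority-voting step mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions: the abstract does not say how individual votes are produced, so the sketch assumes each snippet already carries a predicted label, and the names `vote_label`, `detect`, and the `"background"` label are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def vote_label(snippet_labels: List[str], start: int, end: int) -> Tuple[str, float]:
    """Return the majority label over snippets [start, end) and its share of the votes."""
    votes = Counter(snippet_labels[start:end])
    label, count = votes.most_common(1)[0]
    return label, count / (end - start)


def detect(
    snippet_labels: List[str],
    candidates: List[Tuple[int, int]],
    background: str = "background",
) -> List[Tuple[int, int, str, float]]:
    """Label every candidate region by majority vote and drop background regions."""
    detections = []
    for start, end in candidates:
        label, support = vote_label(snippet_labels, start, end)
        if label != background:
            # (start, end) are snippet indices; support is the winning vote fraction.
            detections.append((start, end, label, support))
    return detections
```

The vote fraction returned alongside each label could serve as a crude confidence score for ranking overlapping candidates, although the abstract does not state whether BoDS does this.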

Online Lecture

Shahid Ashraf

Over the past year, while delivering online lectures because of COVID, he went through many experiences. At first he lectured absorbed in his own thoughts. After a few days, he began to sense the presence and absence of students even though their cameras were off. Sometimes he would ask a question to confirm a student's presence, and his guess would prove correct. Gradually he came to perceive fully which student was present despite the camera being off and which had fallen asleep after joining the link. The mental connection began to feel more effective than the electronic one. He became able to see even with the cameras off. As soon as he pictured a student, the sadness, boredom, absorption, interest, half-heartedness, and other states evident on that face would be revealed to him. He would simply look intently at the ID and the student's state of mind would become apparent. Without addressing the student, he would comment on that student's state and then enjoy the student's astonishment. He went through many experiences while teaching. An idea began to take hold in his heart. He tried to shake it off but failed. Under the influence of this idea, one day he asked all the students to turn on their cameras. The students turned on their cameras and sat down. He could see them all, but at that very moment he received a severe shock. He was unable to read the state of any student.

A Comparative Study of the Ethical Vices Mentioned in the Revealed Books (Torah, Psalms, Gospel and Quran)

Ethical Vices in Divine Books (Quran & Bible): A Comparative Study

Morality implies values that distinguish between good and bad behavior. Divine religions have their own behavioral value frameworks, which are intended to guide followers in distinguishing right from wrong. Moral values are important in life because a person who has never learned about them cannot decide between good and bad. Moral values reflect an individual's character and spirituality, and they help in building good relationships in personal as well as professional life. This article presents a comparative study of ethical vices in the light of the divine books. Behaviors such as pretension, miserliness, exuberance, slandering, lying, fault-finding and curiosity, and mockery are discussed and analyzed in order to highlight the moral teachings of the divine books. Texts from the Torah, Psalms, Gospels, and Quran on these vices are studied and analyzed. The study shows that the divine books other than the Quran discuss immoral or wicked behavior briefly and merely point out the vices, whereas the Quran and Sunnah discuss wicked behavior in detail and also teach strategies that can steer one away from temptations and vices. Thus, the Qur’ānic laws and injunctions make our life good and purposeful in this world and the hereafter.

Processing-Efficient Distributed Adaptive RLS Filtering for Computationally-Constrained Platforms

Achieving fast convergence on an energy-limited and computationally-constrained platform remains a dream in spite of magnificent advances in Integrated Circuit (IC) technologies. For instance, in telephony, echo cancellation requires a high-definition adaptive-filtering algorithm that also needs robust convergence performance while tracking the time-varying uncertainties present in the communication link. Nevertheless, such a high-definition adaptive algorithm cannot be run on an energy-limited, computationally-constrained, inexpensive platform. The research in this thesis proposes a low-complexity distributed adaptive filtering solution for energy-constrained platforms. The thesis is organized in three parts. Part I aims to develop a low-complexity MIMO channel estimation algorithm for MIMO communication systems. Parts II and III provide distributed and diffusion-based adaptive signal processing solutions for computationally-constrained inexpensive platforms. The thesis begins with an overview of adaptive algorithms with implementation constraints and then proceeds to a comprehensive and detailed literature survey. The literature survey can be classified into two major areas, i.e., adaptive filter theory and adaptive algorithm implementation on low-cost platforms. Furthermore, a channel model is presented that considers two multipath components for the MIMO communication environment. Taking it as the reference channel model, a spatiotemporal low-complexity adaptive estimation algorithm is proposed, assuming a time-variant block-fading channel with a fixed number of training symbols. The proposed algorithm exhibits better results than some notable least-squares algorithms in the literature. The effect of varying Doppler rates on the convergence performance of the algorithm is thoroughly observed to validate the algorithm. The simulation results show that the proposed algorithm has low complexity and is independent of the forgetting factor, in contrast to notable adaptive filtering algorithms.
In the second part of the thesis, a novel processing-efficient architecture of a group of inexpensive and computationally-constrained small platforms is proposed for parallely-distributed adaptive signal processing (PDASP) operation. The proposed architecture is capable of running computationally-expensive procedures such as complex adaptive algorithms cooperatively. The proposed PDASP architecture operates properly even if perfect time alignment among the participating platforms is not available. The complexity and processing time of the PDASP scheme are compared with those of sequentially-operated algorithms. The comparative analysis shows that the PDASP scheme, operating in parallel, exhibits much lower computational complexity than the sequentially-operated algorithms. Moreover, for both high and low Doppler rates, the proposed architecture, by operating in parallel, achieves a lower processing time than the sequentially-operated MIMO algorithms.
In Part III, a novel distributed diffusion-based adaptive signal processing (DDASP) architecture for computationally-constrained small platforms is introduced. In the proposed DDASP architecture, the adaptive algorithm is diffused across the desired number of processing devices. The number of processing nodes used in the DDASP architecture depends on the number of MIMO channel streams as well as on the number of multipath components. Therefore, with more nodes and the diffusion mechanism, the proposed DDASP architecture exhibits lower, linear computational complexity on each processing node involved, compared with the proposed PDASP architecture.
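For readers unfamiliar with the baseline this abstract builds on, the sketch below shows the textbook exponentially weighted recursive least-squares (RLS) update with a forgetting factor, written in plain Python. It is not the low-complexity, PDASP, or DDASP algorithm proposed in the thesis; the function name `rls_identify` and the parameters `order`, `lam` (forgetting factor), and `delta` (initial scale of the inverse correlation matrix) are assumptions chosen for illustration.

```python
from typing import List


def rls_identify(inputs: List[float], desired: List[float],
                 order: int = 2, lam: float = 0.99, delta: float = 100.0) -> List[float]:
    """Estimate FIR filter weights with standard exponentially weighted RLS."""
    w = [0.0] * order
    # P tracks the inverse of the weighted input correlation matrix; start at delta * I.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    for n in range(len(inputs)):
        # Regressor: the latest `order` input samples, zero-padded before time 0.
        u = [inputs[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        Pu = [sum(P[i][j] * u[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(u[i] * Pu[i] for i in range(order))
        gain = [value / denom for value in Pu]
        # A priori error: desired sample minus the prediction from the current weights.
        err = desired[n] - sum(w[i] * u[i] for i in range(order))
        w = [w[i] + gain[i] * err for i in range(order)]
        # P <- (P - gain * u^T P) / lam; u^T P equals Pu transposed because P is symmetric.
        P = [[(P[i][j] - gain[i] * Pu[j]) / lam for j in range(order)]
             for i in range(order)]
    return w
```

Each update costs on the order of `order` squared operations because of the matrix `P`. That per-sample cost is what makes full RLS demanding on small platforms, and it is the motivation the abstract gives for distributing the computation across several nodes.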