昆士兰大学COMP3506/COMP7505课程作业辅导

2021-11-05 17:03    来源:留学在线       阅读量:47

COMP3506/7505 Homework Task 4

10 marks total

Overview

The goal of this problem set is to understand how to apply the algorithms and data structures

that have been taught in this course.

You will be required to write a series of algorithms to search through and analyse the data of a

social media feed. The following searches will need to be implemented:

• A search to find all posts made by a user between two dates

• A search to find the first post made by a specific user after a specific date

• A search to find the post with the nth highest upvotes

• A search to find all posts containing a specific text pattern

The intention is for most of the preprocessing to be performed in the constructor of the class,

allowing calls to these searches to be as fast as possible. Most of these searches are trivial to

implement using a brute-force algorithm. The majority of this task’s marks will be awarded for

choosing (and justifying the use of) efficient algorithms and data structures. Simply implementing

a brute force solution will result in very few marks, so you are encouraged to think about how to

maximise the efficiency of your solution before you begin writing code.

You have been supplied with a Java file, FeedAnalyser.java, which is responsible for loading the

data file and performing the searches. Stubs and Javadoc comments for each of the search methods

have been provided. Your implementation should strictly adhere to the documentation.

Other files you have been provided with include:

• FeedItem.java - a data class for storing information about feed items

• Util.java - containing utility methods for performing various functions, in particular for

file parsing (the methods in this class have already been used in the base code - you are not

required to use them in your solution)

You are permitted to use any programming constructs from the Java Collections Framework (or

any other standard Java library) for this task. You should however understand the underlying

implementation of any data structures/algorithms you use from the JCF.

1

Input File

The social media data that you are analysing will be initially stored in a csv file. The constructor

of FeedAnalyser.java is already partially-implemented to load this data from the file. You are

required to modify the constructor so this data is loaded into appropriate data structures.

Each line in the file represents a single post in the feed, and is formatted as follows:

id,username,date,upvotes,content

In this format:

• id is a unique integer identifier

• username is a string representing the user who posted this data

• date is the time the item was posted at (formatted in 24-hour time as dd/mm/yyyy hh:mm:ss)

• upvotes is the number of upvotes this post received (downvotes are possible and are represented

as negative integers)

• content is a (possibly very long) string containing the content of the post

For simplicity, you may assume the username and content fields contain only printable ASCII

characters (i.e. characters with ASCII values from 32 to 126 inclusive) and do not contain the

comma character. You can also assume that the data file is always formatted correctly so there is

no need to implement format validation.

Your Task

1. (4 marks) Implement the four searches as per their Javadoc specifications. You will likely need

to modify the constructor and class’ member variables to achieve an efficient implementation.

2. (6 marks) Describe any design choices that were made while implementing these searches.

In particular you should:

• Identify the underlying theoretical algorithms and data structures used by your code

and any of the classes from the JCF you have used

• State and briefly explain the worst-case time complexity of the constructor and each of

your searches in big-O notation

• If the worst-case time complexity differs from the expected-case time complexity, also

provide the expected-case complexity in big-O notation and explain why these cases

differ

• Justify why the chosen algorithms and data structures were optimal for efficiently implementing

each of the searches (you should discuss both the runtime and memory usage

of your implementation) - keep in mind that there may be no “perfect” solution and

certain implementations may have certain tradeoffs (which you should identify)

• Compare and contrast your design against other potential implementations (including,

but not limited to, brute force implementations or implementations with other tradeoffs)

2

Constraints

• You must write your solution in FeedAnalyser.java - do not modify any other files or

introduce new packages

• You may only use standard Java libraries

• You may introduce new member variables or helper methods but these should have the

strictest access modifiers possible

• Your answer to question 2 should be no longer than 4 pages (at size 12 font and standard

line spacing/margins)

Submission and Marking

Submit two files as a part of your submission. Your solution to question 1 should be in the file

FeedAnalyser.java. Your answer to question 2 should be in a PDF file named README.pdf. Do

not submit any other files or directories. To preserve anonymity, please do not include your name

in any submitted files (it is okay to include your student number).

Question 1 will be partially marked by an automated test suite with timeouts present on each of

the tests. A sample test suite has been provided in FeedAnalyserTest.java. This test suite is not

comprehensive - there is no guarantee that passing these will ensure passing the tests used during

marking. It is recommended, but not required, that you write your own tests for your algorithms.

Passing the tests also does not guarantee an efficient solution - tutors will make mark deductions

if your solution is inefficient. Marks may also be deducted for poor coding style.

You should submit README.pdf using Turnitin - this will be manually marked by a tutor. This file

should be electronically processed - handwritten solutions will not be accepted. Any asymptotic

bounds should be as tight as possible. Your analysis should be as concise as possible, while still

achieving the level of required detail (marks may be deducted for overly long answers). You are

encouraged to support your analysis, justification, and/or comparison with information from other

sources, but you must cite these appropriately (IEEE style is recommended). Using LATEX to

write your README is once again recommended but not required.

Late Submissions and Extensions

Late submissions will not be accepted. It is your responsibility to ensure you have submitted your

work well in advance of the deadline (taking into account the possibility of computer or internet

issues). See the ECP for information about extensions.

Academic Misconduct

Students are reminded of the University’s policy on student misconduct, including plagiarism. See

the course profile and the School web page for more information:

http://www.itee.uq.edu.au/itee-student-misconduct-including-plagiarism.

翻译:

COMP3506 / 7505家庭作业任务4

总分10分

概述

此问题集的目的是了解如何应用

本课程中所教授的算法和数据结构。

您将需要编写一系列算法来搜索和分析

社交媒体供稿的数据。需要执行以下搜索:

•搜索以查找用户在两个日期之间发布的所有帖子

•搜索以查找特定用户在特定日期之后发布的第一篇帖子

•搜索以查找第n个帖子最高投票

•搜索以查找所有包含特定文本模式

的帖子目的是要在类的构造函数中执行大多数预处理,

允许尽可能快地调用这些搜索。这些搜索大多数都不容易

使用暴力算法来实现。该任务的大部分标记将因

选择(并证明使用)有效的算法和数据结构而被授予。简单地实施

蛮力解决方案不会产生太多的痕迹,因此鼓励

您在开始编写代码之前考虑如何最大程度地提高解决方案的效率。

您已经获得了Java文件FeedAnalyser.java,该文件负责加载

数据文件并执行搜索。提供了每种搜索方法的存根和Javadoc注释

。您的实现应严格遵守文档。

您提供的其他文件包括:

•FeedItem.java-用于存储有关提要项目信息的数据类

•Util.java-包含用于执行各种功能(尤其是

文件解析)的实用程序方法(此类中的方法已经在基本代码中使用-

不需要在解决方案中使用它们)

被允许使用Java Collections Framework(或

任何其他标准Java库)中的任何编程构造来完成此任务。但是,您应该了解

从JCF使用的任何数据结构/算法的基础实现。

1、输入文件

您正在分析的社交媒体数据将最初存储在csv文件中。构造函数

FeedAnalyser.java的一部分已经实现,可以从文件中加载此数据。您

需要修改构造函数,以便将此数据加载到适当的数据结构中。

文件中的每一行都代表Feed中的单个帖子,其格式如下:

id,username,date,upvotes,content采用

这种格式:

•id是唯一的整数标识符

•用户名是代表发布此内容的用户的字符串数据

•日期是发布该项目的时间(格式为24小时制,格式为dd / mm / yyyy hh:mm:ss)

•upvotes是此帖子收到的支持的投票数(可以支持downvotes,并

用负整数表示))

•content是一个(可能很长的)字符串,其中包含帖子的内容

为简单起见,您可以假设用户名和内容字段仅包含可打印的ASCII

字符(即ASCII值介于32至126之间的

字符),而不包含逗号。您还可以假定数据文件始终正确格式化,因此

无需实施格式验证。

您的任务

1.(4个标记)根据其Javadoc规范实施四个搜索。您可能需要

修改构造函数和类的成员变量以实现有效的实现。

2.(6分)描述实现这些搜索时所做的任何设计选择。

特别是,您应该:

•确定代码使用的基础理论算法和数据结构

以及您使用过的JCF中的任何类

用最大O表示法陈述并简要说明构造函数的最坏情况时间复杂度和每个搜索

•如果最坏情况时间复杂度与预期时间不同复杂性,还

以大O表示法提供了预期的案例复杂性,并解释了为什么这些案例

有所不同

•证明为什么所选算法和数据结构最适合有效执行

每个搜索(您应该讨论自己的运行时和内存使用情况

)实现)-请记住,可能没有“完美”的解决方案,并且

某些实现可能会有一定的权衡(您应该确定)

•将您的设计与其他可能的实现(包括

但不限于蛮力实现或具有其他折衷的实现)进行比较和对比

2、约束

•您必须在FeedAnalyser.java中编写解决方案-请勿修改任何其他文件或

引入新的包

•您只能使用标准Java库

•您可以引入新的成员变量或辅助方法,但这些变量应具有

最严格的访问修饰符

•您对问题2的回答不得超过4页(以12号字体和标准

行间距/页边距)

提交和标记

提交两个文件作为提交的一部分。您对问题1的解决方案应在文件中

FeedAnalyser.java。您对问题2的答案应该在名为README.pdf的PDF文件中。不要

提交任何其他文件或目录。为了保持匿名,请不要

在任何已提交的文件中包含您的名字(可以包含您的学生编号)。

自动化测试套件将部分标记问题1,并且每个

测试都存在超时。FeedAnalyserTest.java中提供了一个示例测试套件。该测试套件并不

全面-无法保证通过这些测试将确保通过

标记过程中使用的测试。建议(但不是必需)为算法编写自己的测试。

通过测试也不能保证有效的解决方案-导师将扣除分数

如果您的解决方案效率低下。如果编码风格较差,也可以扣除标记。

您应该使用Turnitin提交README.pdf,这将由导师手动标记。该文件

应进行电子处理-手写解决方案将不被接受。任何渐近

边界都应尽可能紧密。您的分析应尽可能简明扼要,同时仍能

达到要求的详细程度(可能因过长的答案而扣分)。我们

鼓励您支持分析,论证和/或与其他

来源的信息进行比较,但是您必须适当地引用它们(建议使用IEEE样式)。

再次建议使用LATEX 编写自述文件,但这不是必需的。

逾期提交和扩展

将不接受逾期提交。您有责任确保

在截止日期之前很好地提交了工作(考虑到计算机或互联网

问题的可能性)。有关扩展的信息,请参阅ECP。

学术不端行为

提醒学生注意大学关于学生不端行为(包括窃)的政策。有关

更多信息,请参见课程资料和学校网页:

http : //www.itee.uq.edu.au/itee-student-misconduct-includes-plagiarism。

 昆士兰大学COMP3506/COMP7505课程作业辅导选择在线一对一辅导

"留学在线"的新闻页面文章、图片、音频、视频等稿件均为自媒体人、第三方机构发布或转载。如稿件涉及版权等问题,请与

我们联系删除或处理,客服邮箱756005163@qq.com,稿件内容仅为传递更多信息之目的,不代表本网观点,亦不代表本网站赞同

其观点或证实其内容的真实性。

Baidu
map